Age | Commit message | Author | |
---|---|---|---|
2017-07-08 | Made the inline keyword in kernels optional currently only enabled for NVIDIA and ARM GPUs | Cedric Nugteren | |
2016-09-12 | Split the XGEMM kernel further up: now in 3 parts. This is done because MSVC can't handle long strings | Cedric Nugteren | |
2016-06-08 | Added global memory synchronisation for better cache performance on ARM Mali GPUs | Cedric Nugteren | |
2016-05-15 | Added support for staggered/shuffled offsets for GEMM to improve performance for large power-of-2 kernels on AMD GPUs | cnugteren | |
2016-02-08 | Separated the GEMM kernel in two parts to reduce string length for MSVC | Cedric Nugteren | |