path: root/src/kernels/level3/xgemm_part1.opencl
Age | Commit message | Author
2017-07-08 | Made the inline keyword in kernels optional currently only enabled for NVIDIA and ARM GPUs | Cedric Nugteren
2016-09-12 | Split the XGEMM kernel further up: now in 3 parts. This is done because MSVC can't handle long strings | Cedric Nugteren
2016-06-08 | Added global memory synchronisation for better cache performance on ARM Mali GPUs | Cedric Nugteren
2016-05-15 | Added support for staggered/shuffled offsets for GEMM to improve performance for large power-of-2 kernels on AMD GPUs | cnugteren
2016-02-08 | Separated the GEMM kernel in two parts to reduce string length for MSVC | Cedric Nugteren