Age | Commit message | Author |
---|---|---|
2017-12-07 | Added register promotion to the main GEMM kernel | Cedric Nugteren |
2017-12-03 | Added GEMM (direct and indirect) to the pre-processor testing; modified the ... | Cedric Nugteren |
2017-10-14 | Make local memory pointers a define in OpenCL; some fixes to the recently cha... | Cedric Nugteren |
2017-07-08 | Made the inline keyword in kernels optional; currently only enabled for NVIDIA... | Cedric Nugteren |
2016-09-12 | Split the XGEMM kernel further up: now in 3 parts. This is done because MSVC ... | Cedric Nugteren |
2016-06-08 | Added global memory synchronisation for better cache performance on ARM Mali ... | Cedric Nugteren |
2016-05-15 | Added support for staggered/shuffled offsets for GEMM to improve performance ... | cnugteren |
2016-02-08 | Separated the GEMM kernel in two parts to reduce string length for MSVC | Cedric Nugteren |