path: root/src/kernels/level3/xgemm_part3.opencl
Age | Commit message | Author
2017-12-09 | Fixed defines parsing and substituting in pre-processor; fixed some variable names in kernels | Cedric Nugteren
2017-12-07 | Added register promotion to the main GEMM kernel | Cedric Nugteren
2017-12-03 | Added GEMM (direct and in-direct) to the pre-processor testing; modified the loops in kernel accordingly | Cedric Nugteren
2017-10-14 | Make local memory pointers a define in OpenCL; some fixes to the recently changed transpose kernel code | Cedric Nugteren
2017-10-03 | Gemm in-direct implementation now uses only 1 larger instead of max 3 optional temporary buffers | Cedric Nugteren
2017-07-08 | Made the inline keyword in kernels optional currently only enabled for NVIDIA and ARM GPUs | Cedric Nugteren
2016-10-22 | Fixed a bug in the SYRK/SYR2K/HERK/HER2K routines that would occur with specific tuning parameters (2) | Cedric Nugteren
2016-10-22 | Fixed a bug in the SYRK/SYR2K/HERK/HER2K routines that would occur with specific tuning parameters | Cedric Nugteren
2016-09-12 | Split the XGEMM kernel further up: now in 3 parts. This is done because MSVC can't handle long strings | Cedric Nugteren