path: root/src/kernels/level3/xgemm_direct_part2.opencl
Age | Commit message | Author
2016-12-18 | Fixed a bug when using offsets in the direct GEMM kernels | Cedric Nugteren
2016-10-03 | Fixed a const-correctness issue with complex conjugation in the GEMM direct kernel | Cedric Nugteren
2016-10-03 | Added functions to load from off-chip to local memory without vector loads for the GEMM direct kernels | Cedric Nugteren
2016-10-03 | Re-organised GEMM direct kernel and added faster fall-back version for incomplete rectangles | Cedric Nugteren
2016-10-02 | Specialised the GEMM direct kernel in four ways for transposing/non-transposing: NN, NT, TN, TT | Cedric Nugteren
2016-10-02 | Split the GEMM direct kernel into two files; set the default tuning target to 256-256-256 | Cedric Nugteren