index : debian-clblast
Debian package for CLBlast.
gspr@nonempty.org
path: root/src/kernels/level3/xgemm_part3.opencl
Age
Commit message
Author
2017-12-09
Fixed defines parsing and substituting in pre-processor; fixed some variable names in kernels
Cedric Nugteren
2017-12-07
Added register promotion to the main GEMM kernel
Cedric Nugteren
2017-12-03
Added GEMM (direct and in-direct) to the pre-processor testing; modified the loops in kernel accordingly
Cedric Nugteren
2017-10-14
Make local memory pointers a define in OpenCL; some fixes to the recently changed transpose kernel code
Cedric Nugteren
2017-10-03
Gemm in-direct implementation now uses only 1 larger instead of max 3 optional temporary buffers
Cedric Nugteren
2017-07-08
Made the inline keyword in kernels optional currently only enabled for NVIDIA and ARM GPUs
Cedric Nugteren
2016-10-22
Fixed a bug in the SYRK/SYR2K/HERK/HER2K routines that would occur with specific tuning parameters (2)
Cedric Nugteren
2016-10-22
Fixed a bug in the SYRK/SYR2K/HERK/HER2K routines that would occur with specific tuning parameters
Cedric Nugteren
2016-09-12
Split the XGEMM kernel further up: now in 3 parts. This is done because MSVC can't handle long strings
Cedric Nugteren