Age | Commit message | Author | |
---|---|---|---|
2017-12-03 | Reformatted transpose kernels for the pre-processor; extended the number of tests | Cedric Nugteren | |
2017-10-14 | Make local memory pointers a define in OpenCL; some fixes to the recently changed transpose kernel code | Cedric Nugteren | |
2017-07-08 | Made the inline keyword in kernels optional; currently only enabled for NVIDIA and ARM GPUs | Cedric Nugteren | |
2017-03-19 | Added batched versions of the pad/copy/transpose kernels | Cedric Nugteren | |
2016-08-20 | Merge branch 'master' of https://github.com/dvasschemacq/CLBlast into dvasschemacq-master. Conflicts: src/kernels/level1/xaxpy.opencl, src/kernels/level2/xgemv.opencl, src/kernels/level2/xgemv_fast.opencl, src/kernels/level2/xger.opencl, src/kernels/level2/xher.opencl, src/kernels/level2/xher2.opencl, src/kernels/level3/xgemm_part2.opencl | Cedric Nugteren | |
2016-08-18 | Adapt OpenCL files for OpenCL 1.1. In OpenCL 1.1, `__kernel` has to come before `__attribute__`, at least with the Vivante compiler. | D. Van Assche | |
2016-07-10 | Now passing alpha/beta to the kernel as arguments, as before fp16 support; in case of fp16, arguments are cast on host and in kernel | Cedric Nugteren | |
2016-06-16 | Added XOMATCOPY routines to perform out-of-place matrix scaling, copying, and/or transposing | Cedric Nugteren | |
2016-06-14 | Re-organised the level-3 supporting kernels (copy, pad, transpose, convert) and renamed files and functions appropriately | Cedric Nugteren | |