Age | Commit message | Author | |
---|---|---|---|
2017-12-05 | Improved array-to-register promotion, now handling function calls as well | Cedric Nugteren | |
2017-11-29 | Reformatted unrollable kernel loops and added the new promote_to_registers pragma for several kernels | Cedric Nugteren | |
2017-07-08 | Made the inline keyword in kernels optional; currently only enabled for NVIDIA and ARM GPUs | Cedric Nugteren | |
2016-08-20 | Merge branch 'master' of https://github.com/dvasschemacq/CLBlast into dvasschemacq-master. Conflicts: src/kernels/level1/xaxpy.opencl, src/kernels/level2/xgemv.opencl, src/kernels/level2/xgemv_fast.opencl, src/kernels/level2/xger.opencl, src/kernels/level2/xher.opencl, src/kernels/level2/xher2.opencl, src/kernels/level3/xgemm_part2.opencl | Cedric Nugteren | |
2016-08-18 | Adapt OpenCL files for OpenCL 1.1. In OpenCL 1.1, `__kernel` has to come before `__attribute__`, at least with the Vivante compiler. | D. Van Assche | |
2016-07-23 | Fixed a bug in the new XgemvFastRot kernel related to local memory size | Cedric Nugteren | |
2016-07-23 | Further improvements to the XgemvFastRot kernel, properly enables coalescing now | Cedric Nugteren | |
2016-07-23 | Improved the XgemvFastRot kernel by tiled loading of the input matrix A, enabling better memory performance | Cedric Nugteren | |
2016-07-10 | Now passing alpha/beta to the kernel as arguments, as before fp16 support; in case of fp16, arguments are cast on host and in kernel | Cedric Nugteren | |
2016-05-22 | Prepared the GEMV kernels and tuner for half-precision support | Cedric Nugteren | |
2016-02-08 | Split-up the XGEMV kernel in two parts | Cedric Nugteren | |