Age | Commit message | Author | |
---|---|---|---|
2017-12-05 | Improved array-to-register promotion, now handling function calls as well | Cedric Nugteren | |
2017-11-29 | Reformatted unrollable kernel loops and added the new promote_to_registers pragma for several kernels | Cedric Nugteren | |
2017-07-08 | Made the inline keyword in kernels optional; currently only enabled for NVIDIA and ARM GPUs | Cedric Nugteren | |
2016-08-20 | Merge branch 'master' of https://github.com/dvasschemacq/CLBlast into dvasschemacq-master. Conflicts: src/kernels/level1/xaxpy.opencl, src/kernels/level2/xgemv.opencl, src/kernels/level2/xgemv_fast.opencl, src/kernels/level2/xger.opencl, src/kernels/level2/xher.opencl, src/kernels/level2/xher2.opencl, src/kernels/level3/xgemm_part2.opencl | Cedric Nugteren | |
2016-08-18 | Adapt opencl files for 1.1 OpenCL. In OpenCL 1.1 `__kernel` has to be before `__attribute__`, at least with Vivante compiler. | D. Van Assche | |
2016-07-10 | Now passing alpha/beta to the kernel as arguments as before fp16 support; in case of fp16, arguments are cast on host and in kernel | Cedric Nugteren | |
2016-05-22 | Prepared the GEMV kernels and tuner for half-precision support | Cedric Nugteren | |
2016-02-08 | Split up the XGEMV kernel into two parts | Cedric Nugteren | |
2016-02-06 | Reduced unrolling factor in xgemv kernel to reduce compilation times | CNugteren | |
2015-09-26 | Added TRMV/TBMV/TPMV routines | CNugteren | |
2015-09-19 | Added SBMV and SPMV routines | CNugteren | |
2015-09-19 | Added the HPMV routine | CNugteren | |
2015-09-19 | Added the HBMV routine | CNugteren | |
2015-09-18 | Improved the organization and performance of level 2 routines | CNugteren | |
2015-09-18 | Added first version of banded matrix-vector multiplication | CNugteren | |