Age | Commit message | Author | |
---|---|---|---|
2017-03-08 | Implemented a batched version of the AXPY kernel | Cedric Nugteren | |
2017-03-08 | Make batched routines based on offsets instead of a vector of cl_mem objects - undoing many earlier changes | Cedric Nugteren | |
2016-08-20 | Merge branch 'master' of https://github.com/dvasschemacq/CLBlast into dvasschemacq-master | Cedric Nugteren | |
Conflicts: src/kernels/level1/xaxpy.opencl, src/kernels/level2/xgemv.opencl, src/kernels/level2/xgemv_fast.opencl, src/kernels/level2/xger.opencl, src/kernels/level2/xher.opencl, src/kernels/level2/xher2.opencl, src/kernels/level3/xgemm_part2.opencl | |||
2016-08-18 | Adapt opencl files for 1.1 OpenCL | D. Van Assche | |
In OpenCL 1.1 __kernel has to be before __attribute__, at least with Vivante compiler. | |||
2016-07-10 | Now passing alpha/beta to the kernel as arguments as before fp16 support; in case of fp16 arguments are cast on host and in kernel | Cedric Nugteren | |
2016-05-14 | Set kernel arguments for AXPY as constant memory buffers, making it possible to transfer half-precision values as well | Cedric Nugteren | |
2016-05-13 | Initial experimental version of the half-precision HAXPY routine | Cedric Nugteren | |
2016-05-08 | Fixed errors in xAXPY and xSCAL tests on AMD hardware | cnugteren | |
2015-08-22 | Added the XSWAP, XSCAL and XCOPY level-1 routines | CNugteren | |
2015-08-22 | Re-organized level1 xaxpy kernel | CNugteren | |