path: root/src/kernels/level2/xgemv.opencl
Age | Commit message | Author
2017-12-05 | Improved array-to-register promotion, now handling function calls as well | Cedric Nugteren
2017-11-29 | Reformatted unrollable kernel loops and added the new promote_to_registers pragma for several kernels | Cedric Nugteren
2017-07-08 | Made the inline keyword in kernels optional currently only enabled for NVIDIA and ARM GPUs | Cedric Nugteren
2016-08-20 | Merge branch 'master' of https://github.com/dvasschemacq/CLBlast into dvasschemacq-master | Cedric Nugteren
    Conflicts:
        src/kernels/level1/xaxpy.opencl
        src/kernels/level2/xgemv.opencl
        src/kernels/level2/xgemv_fast.opencl
        src/kernels/level2/xger.opencl
        src/kernels/level2/xher.opencl
        src/kernels/level2/xher2.opencl
        src/kernels/level3/xgemm_part2.opencl
2016-08-18 | Adapt opencl files for 1.1 OpenCL | D. Van Assche
    In OpenCL 1.1 __kernel has to be before __attribute__, at least with Vivante compiler.
2016-07-10 | Now passing alpha/beta to the kernel as arguments as before fp16 support; in case of fp16 arguments are cast on host and in kernel | Cedric Nugteren
2016-05-22 | Prepared the GEMV kernels and tuner for half-precision support | Cedric Nugteren
2016-02-08 | Split-up the XGEMV kernel in two parts | Cedric Nugteren
2016-02-06 | Reduced unrolling factor in xgemv kernel to reduce compilation times | CNugteren
2015-09-26 | Added TRMV/TBMV/TPMV routines | CNugteren
2015-09-19 | Added SBMV and SPMV routines | CNugteren
2015-09-19 | Added the HPMV routine | CNugteren
2015-09-19 | Added the HBMV routine | CNugteren
2015-09-18 | Improved the organization and performance of level 2 routines | CNugteren
2015-09-18 | Added first version of banded matrix-vector multiplication | CNugteren