Age | Commit message | Author |
2023-01-17 | Updated according to feedback from CNugteren | Angus, Alexander |
2023-01-03 | implemented changes to boost Adreno performance according to https://jira-dc.... | Angus, Alexander |
2018-05-31 | Some potential fixes for error -54 when launching TRSV and TRSM kernels | Cedric Nugteren |
2018-03-15 | Fixed a failing TRSV test using a CPU with Apple OpenCL | Cedric Nugteren |
2017-12-09 | Completed kernel modifications for pre-processor of all other kernels | Cedric Nugteren |
2017-12-05 | Improved array-to-register promotion, now handling function calls as well | Cedric Nugteren |
2017-11-29 | Reformatted unrollable kernel loops and added the new promote_to_registers pr... | Cedric Nugteren |
2017-10-17 | CUDA kernel compilation fixes | Cedric Nugteren |
2017-07-08 | Made the inline keyword in kernels optional currently only enabled for NVIDIA... | Cedric Nugteren |
2017-02-05 | Fixed complex version of the TRSV kernel | Cedric Nugteren |
2017-02-04 | Improved substitution kernels a bit; added complex support | Cedric Nugteren |
2017-02-04 | Completed a first STRSV implementation | Cedric Nugteren |
2017-01-29 | Added first (incomplete) version of TRSV routine | Cedric Nugteren |
2016-08-20 | Merge branch 'master' of https://github.com/dvasschemacq/CLBlast into dvassch... | Cedric Nugteren |
2016-08-18 | Adapt opencl files for 1.1 OpenCL | D. Van Assche |
2016-07-23 | Fixed a bug in the new XgemvFastRot kernel related to local memory size | Cedric Nugteren |
2016-07-23 | Further improvements to the XgemvFastRot kernel, properly enables coalescing now | Cedric Nugteren |
2016-07-23 | Improved the XgemvFastRot kernel by tiled loading of the input matrix A, enab... | Cedric Nugteren |
2016-07-10 | Now passing alpha/beta to the kernel as arguments as before fp16 support; in ... | Cedric Nugteren |
2016-05-22 | Prepared the GER kernels and tuner for half-precision support | Cedric Nugteren |
2016-05-22 | Prepared the GEMV kernels and tuner for half-precision support | Cedric Nugteren |
2016-03-06 | Added preliminary support for xHPR2 and xSPR2 routines | Cedric Nugteren |
2016-03-02 | Added preliminary support for xHER2 and xSYR2 routines | Cedric Nugteren |
2016-02-28 | Fixed a couple of correctness bugs in the Xher kernels | Cedric Nugteren |
2016-02-28 | Added support for xHER, xHPR, xSYR, and xSPR routines | Cedric Nugteren |
2016-02-20 | Added support for xGERU and xGERC routines | Cedric Nugteren |
2016-02-20 | Added XGER routine, kernel, and tuner | Cedric Nugteren |
2016-02-08 | Split-up the XGEMV kernel in two parts | Cedric Nugteren |
2016-02-06 | Reduced unrolling factor in xgemv kernel to reduce compilation times | CNugteren |
2015-09-26 | Added TRMV/TBMV/TPMV routines | CNugteren |
2015-09-19 | Added SBMV and SPMV routines | CNugteren |
2015-09-19 | Added the HPMV routine | CNugteren |
2015-09-19 | Added the HBMV routine | CNugteren |
2015-09-18 | Improved the organization and performance of level 2 routines | CNugteren |
2015-09-18 | Added first version of banded matrix-vector multiplication | CNugteren |