path: root/src/kernels/level2
Age | Commit message | Author
2023-01-17 | Updated according to feedback from CNugteren | Angus, Alexander
2023-01-03 | implemented changes to boost Adreno performance according to https://jira-dc.... | Angus, Alexander
2018-05-31 | Some potential fixes for error -54 when launching TRSV and TRSM kernels | Cedric Nugteren
2018-03-15 | Fixed a failing TRSV test using a CPU with Apple OpenCL | Cedric Nugteren
2017-12-09 | Completed kernel modifications for pre-processor of all other kernels | Cedric Nugteren
2017-12-05 | Improved array-to-register promotion, now handling function calls as well | Cedric Nugteren
2017-11-29 | Reformatted unrollable kernel loops and added the new promote_to_registers pr... | Cedric Nugteren
2017-10-17 | CUDA kernel compilation fixes | Cedric Nugteren
2017-07-08 | Made the inline keyword in kernels optional currently only enabled for NVIDIA... | Cedric Nugteren
2017-02-05 | Fixed complex version of the TRSV kernel | Cedric Nugteren
2017-02-04 | Improved substitution kernels a bit; added complex support | Cedric Nugteren
2017-02-04 | Completed a first STRSV implementation | Cedric Nugteren
2017-01-29 | Added first (incomplete) version of TRSV routine | Cedric Nugteren
2016-08-20 | Merge branch 'master' of https://github.com/dvasschemacq/CLBlast into dvassch... | Cedric Nugteren
2016-08-18 | Adapt opencl files for 1.1 OpenCL | D. Van Assche
2016-07-23 | Fixed a bug in the new XgemvFastRot kernel related to local memory size | Cedric Nugteren
2016-07-23 | Further improvements to the XgemvFastRot kernel, properly enables coalescing now | Cedric Nugteren
2016-07-23 | Improved the XgemvFastRot kernel by tiled loading of the input matrix A, enab... | Cedric Nugteren
2016-07-10 | Now passing alpha/beta to the kernel as arguments as before fp16 support; in ... | Cedric Nugteren
2016-05-22 | Prepared the GER kernels and tuner for half-precision support | Cedric Nugteren
2016-05-22 | Prepared the GEMV kernels and tuner for half-precision support | Cedric Nugteren
2016-03-06 | Added preliminary support for xHPR2 and xSPR2 routines | Cedric Nugteren
2016-03-02 | Added preliminary support for xHER2 and xSYR2 routines | Cedric Nugteren
2016-02-28 | Fixed a couple of correctness bugs in the Xher kernels | Cedric Nugteren
2016-02-28 | Added support for xHER, xHPR, xSYR, and xSPR routines | Cedric Nugteren
2016-02-20 | Added support for xGERU and xGERC routines | Cedric Nugteren
2016-02-20 | Added XGER routine, kernel, and tuner | Cedric Nugteren
2016-02-08 | Split-up the XGEMV kernel in two parts | Cedric Nugteren
2016-02-06 | Reduced unrolling factor in xgemv kernel to reduce compilation times | CNugteren
2015-09-26 | Added TRMV/TBMV/TPMV routines | CNugteren
2015-09-19 | Added SBMV and SPMV routines | CNugteren
2015-09-19 | Added the HPMV routine | CNugteren
2015-09-19 | Added the HBMV routine | CNugteren
2015-09-18 | Improved the organization and performance of level 2 routines | CNugteren
2015-09-18 | Added first version of banded matrix-vector multiplication | CNugteren