path: root/src/kernels/level1
Age | Commit message | Author
2023-05-07 | AMAX/AMIN integer testing and bug fixes (#457) | Cedric Nugteren
2023-01-17 | Updated according to feedback from CNugteren | Angus, Alexander
2023-01-03 | implemented changes to boost Adreno performance according to https://jira-dc.... | Angus, Alexander
2022-04-22 | sum fix | Justin Graham
2020-03-08 | Made it more likely (but no guarantees) for amax/amin to return the first index | Cedric Nugteren
2019-09-04 | Fix out-of-bounds read/write in XhadFaster | etomzak
2019-05-19 | Fixed a bug in the absolute-min index kernel | Cedric Nugteren
2018-10-17 | Fixed a bug with the pre-processing and the AXPY kernel | Cedric Nugteren
2018-10-15 | Fixed a bug in the XaxpyFaster kernel for specific parameters | Cedric Nugteren
2018-02-02 | Implemented the XHAD Hadamard product routine | Cedric Nugteren
2017-12-09 | Completed kernel modifications for pre-processor of all other kernels | Cedric Nugteren
2017-12-03 | Added GEMM (direct and in-direct) to the pre-processor testing; modified the ... | Cedric Nugteren
2017-11-29 | Reformatted unrollable kernel loops and added the new promote_to_registers pr... | Cedric Nugteren
2017-11-25 | Implemented first simple pre-processor: defines parser and loop unrolling bas... | Cedric Nugteren
2017-07-08 | Made the inline keyword in kernels optional currently only enabled for NVIDIA... | Cedric Nugteren
2017-05-12 | Added the IxAMIN routines: absolute minimum version of IxAMAX | Cedric Nugteren
2017-04-14 | Added a new Xaxpy kernel in between the regular and fast version in | Cedric Nugteren
2017-03-10 | Added proper testing of the alpha parameter; finalized the batched AXPY imple... | Cedric Nugteren
2017-03-08 | Implemented a batched version of the AXPY kernel | Cedric Nugteren
2017-03-08 | Make batched routines based on offsets instead of a vector of cl_mem objects ... | Cedric Nugteren
2016-08-20 | Merge branch 'master' of https://github.com/dvasschemacq/CLBlast into dvassch... | Cedric Nugteren
2016-08-18 | Adapt opencl files for 1.1 OpenCL | D. Van Assche
2016-07-10 | Now passing alpha/beta to the kernel as arguments as before fp16 support; in ... | Cedric Nugteren
2016-05-14 | Set kernel arguments for AXPY as constant memory buffers, making it possible ... | Cedric Nugteren
2016-05-13 | Initial experimental version of the half-precision HAXPY routine | Cedric Nugteren
2016-05-08 | Fixed errors in xAXPY and xSCAL tests on AMD hardware | cnugteren
2016-04-30 | Added non-aboslute minimum counter-part IxMIN of the BLAS routine IxAMAX | Cedric Nugteren
2016-04-27 | Added non-absolute counter-parts xSUM and IxMAX of the BLAS routines xASUM an... | Cedric Nugteren
2016-04-20 | Added support for the iSAMAX/iDAMAX/iCAMAX/iZAMAX routines | cnugteren
2016-04-14 | Added support for the SASUM/DASUM/ScASUM/DzASUM routines | cnugteren
2016-03-30 | Fixed the nrm2 kernel for complex data-types | cnugteren
2016-03-28 | Added preliminary support for the xNRM2 routines | Cedric Nugteren
2015-09-14 | Added xDOT/xDOTU/xDOTC dot-product routines | CNugteren
2015-08-22 | Added the XSWAP, XSCAL and XCOPY level-1 routines | CNugteren
2015-08-22 | Re-organized level1 xaxpy kernel | CNugteren