path: root/src
| Age | Commit message | Author |
|---|---|---|
| 2016-05-24 | Added proper argument handling and displaying for half-precision data-types | Cedric Nugteren |
| 2016-05-22 | Added level-2 half-precision routines HGER/HSYR/HSPR/HSYR2/HSPR2 | Cedric Nugteren |
| 2016-05-22 | Fixed tuning results for half-precision; added first results for the xGER ker... | Cedric Nugteren |
| 2016-05-22 | Prepared the GER kernels and tuner for half-precision support | Cedric Nugteren |
| 2016-05-22 | Added level-2 half-precision routines HGEMV/HGBMV/HHEMV/HHBMV/HHPMV/HSYMV/HSB... | Cedric Nugteren |
| 2016-05-22 | Added first tuning results for the half-precision xGEMV kernels | Cedric Nugteren |
| 2016-05-22 | Prepared the GEMV kernels and tuner for half-precision support | Cedric Nugteren |
| 2016-05-22 | Added level-1 half-precision routines HSWAP/HSCAL/HCOPY/HAXPY/HDOT/HNRM2/HASU... | Cedric Nugteren |
| 2016-05-22 | Added first tuning results for the half-precision xDOT kernels | Cedric Nugteren |
| 2016-05-22 | Added half-precision support for all level 1 routines | Cedric Nugteren |
| 2016-05-18 | Merged in latest changes from 0.7.1 release | Cedric Nugteren |
| 2016-05-16 | Added half precision tuning results for supporting kernels (pad, copy, transp... | Cedric Nugteren |
| 2016-05-16 | Prepared GEMM and supporting kernels and tuners for half-precision support | Cedric Nugteren |
| 2016-05-15 | Added header with conversions from and to half-precision floating-point | Cedric Nugteren |
| 2016-05-14 | Set kernel arguments for AXPY as constant memory buffers, making it possible ... | Cedric Nugteren |
| 2016-05-13 | Initial experimental version of the half-precision HAXPY routine | Cedric Nugteren |
| 2016-05-12 | Initial changes in preparation for half-precision fp16 support | Cedric Nugteren |
| 2016-05-08 | Fixed errors in xAXPY and xSCAL tests on AMD hardware | cnugteren |
| 2016-05-02 | Fixed the calculation of the required buffer sizes in case of subvectors and ... | Cedric Nugteren |
| 2016-05-01 | Made the default xDOT tuning size smaller | Cedric Nugteren |
| 2016-05-01 | Changed the index buffer of IxAMAX routines to unsigned int for proper buffer... | Cedric Nugteren |
| 2016-05-01 | Added a program cache (per-context) next to the per-device binary cache | Cedric Nugteren |
| 2016-04-30 | Added non-absolute minimum counter-part IxMIN of the BLAS routine IxAMAX | Cedric Nugteren |
| 2016-04-29 | Added FillCache: a function to pre-compile all kernels for a specific device | Cedric Nugteren |
| 2016-04-28 | Fixed the cache to store binaries instead of OpenCL programs | Cedric Nugteren |
| 2016-04-27 | Added non-absolute counter-parts xSUM and IxMAX of the BLAS routines xASUM an... | Cedric Nugteren |
| 2016-04-27 | Added prototypes for non-BLAS routines: xSUM and IxMAX (non-absolute counterp... | Cedric Nugteren |
| 2016-04-27 | Moved all cache-related functions to a separate file; added a ClearCompiledPr... | Cedric Nugteren |
| 2016-04-20 | Added support for the iSAMAX/iDAMAX/iCAMAX/iZAMAX routines | cnugteren |
| 2016-04-20 | Added prototype for ixAMAX routines | cnugteren |
| 2016-04-14 | Updated the reduction-kernel tuner to also tune the epilogue | cnugteren |
| 2016-04-14 | Added support for the SASUM/DASUM/ScASUM/DzASUM routines | cnugteren |
| 2016-04-13 | Added prototype for xASUM routines | cnugteren |
| 2016-04-09 | Events are now properly implemented using event waiting list and asking the u... | cnugteren |
| 2016-04-04 | Removed redundant queue synchronisation statements | cnugteren |
| 2016-04-01 | Added a wrapper for CBLAS libraries for performance/correctness testing | cnugteren |
| 2016-03-30 | Merge branch 'level1_routines' into development | cnugteren |
| 2016-03-30 | Fixed the nrm2 kernel for complex data-types | cnugteren |
| 2016-03-30 | Added prototypes for the xROTM and xROTMG routines | Cedric Nugteren |
| 2016-03-30 | Added prototypes for the xROT and xROTG functions | Cedric Nugteren |
| 2016-03-30 | Fixed properly passing of OpenCL events to CLBlast functions | Cedric Nugteren |
| 2016-03-28 | Added preliminary support for the xNRM2 routines | Cedric Nugteren |
| 2016-03-25 | Added prototypes for ScNRM2/DzNRM2 routines | Cedric Nugteren |
| 2016-03-25 | Added prototypes for SNRM2/DNRM2 routines | Cedric Nugteren |
| 2016-03-23 | Fixed the C-api export to be able to properly build a DLL on Windows | Cedric Nugteren |
| 2016-03-19 | Added __declspec(dllexport) to create a DLL on Windows | Cedric Nugteren |
| 2016-03-14 | Made the library thread-safe by guarding the kernel cache with a mutex | Cedric Nugteren |
| 2016-03-06 | Fixed a bug in the GER-family of routines due to incorrect division of the wo... | Cedric Nugteren |
| 2016-03-06 | Added preliminary support for xHPR2 and xSPR2 routines | Cedric Nugteren |
| 2016-03-02 | Added preliminary support for xHER2 and xSYR2 routines | Cedric Nugteren |