path: root/CHANGELOG
Age | Commit message | Author
2016-06-16 | Added XOMATCOPY routines to perform out-of-place matrix scaling, copying, and/or transposing | Cedric Nugteren
2016-06-13 | Improved API documentation and added documentation for level-2 and level-3 routines | Cedric Nugteren
2016-06-01 | Added tuning parameters for 'GRID K520' and 'HD Graphics Skylake ULT GT2' | Cedric Nugteren
2016-05-31 | Made use of CMake's built-in unit testing, allowing all tests to be run using 'make test' | Cedric Nugteren
2016-05-30 | Increased the verbosity of the -verbose option in the correctness tests | Cedric Nugteren
2016-05-30 | Separated the performance tests (clients) from the correctness tests in CMake | Cedric Nugteren
2016-05-30 | Merge branch 'half_precision' into development | Cedric Nugteren
2016-05-25 | Added level-3 half-precision routines HGEMM/HSYMM/HSYRK/HSYR2K/HTRMM | Cedric Nugteren
2016-05-22 | Added level-2 half-precision routines HGER/HSYR/HSPR/HSYR2/HSPR2 | Cedric Nugteren
2016-05-22 | Added level-2 half-precision routines HGEMV/HGBMV/HHEMV/HHBMV/HHPMV/HSYMV/HSBMV/HSPMV/HTRMV/HTBMV/HTPMV | Cedric Nugteren
2016-05-22 | Added level-1 half-precision routines HSWAP/HSCAL/HCOPY/HAXPY/HDOT/HNRM2/HASUM/HSUM/iHAMAX/iHMAX/iHMIN | Cedric Nugteren
2016-05-18 | Merged in latest changes from 0.7.1 release | Cedric Nugteren
2016-05-18 | Prepared the changelog for the next release | Cedric Nugteren
2016-05-18 | Updated to version 0.7.1 | Cedric Nugteren
2016-05-17 | Made MSVC link the run-time libraries statically | Cedric Nugteren
2016-05-15 | Added header with conversions from and to half-precision floating-point | Cedric Nugteren
2016-05-15 | Fixed a bug in the xGEMM routine related to the event incorrectly set | cnugteren
2016-05-15 | Added support for staggered/shuffled offsets for GEMM to improve performance for large power-of-2 kernels on AMD GPUs | cnugteren
2016-05-08 | Prepared the changelog for the next release | Cedric Nugteren
2016-05-08 | Updated to version 0.7.0 | Cedric Nugteren
2016-05-08 | Added preliminary generated API documentation | Cedric Nugteren
2016-05-07 | Added an option to the tests to control whether to test against clBLAS or a CPU BLAS library | Cedric Nugteren
2016-05-02 | Added tuning results for AMD Hawaii (R9 290X) | Cedric Nugteren
2016-04-30 | Added non-absolute minimum counter-part IxMIN of the BLAS routine IxAMAX | Cedric Nugteren
2016-04-28 | Fixed the cache to store binaries instead of OpenCL programs | Cedric Nugteren
2016-04-27 | Added non-absolute counter-parts xSUM and IxMAX of the BLAS routines xASUM and IxAMAX | Cedric Nugteren
2016-04-27 | Moved all cache-related functions to a separate file; added a ClearCompiledProgramCache function to clear the cache | Cedric Nugteren
2016-04-20 | Added support for the iSAMAX/iDAMAX/iCAMAX/iZAMAX routines | cnugteren
2016-04-14 | Updated the reduction-kernel tuner to also tune the epilogue | cnugteren
2016-04-03 | Updated the documentation in light of the support for a reference CPU BLAS library | cnugteren
2016-03-31 | Updated the documentation | cnugteren
2016-03-23 | Fixed the C-api export to be able to properly build a DLL on Windows | Cedric Nugteren
2016-03-14 | Made the library thread-safe by guarding the kernel cache with a mutex | Cedric Nugteren
2016-03-13 | Prepared the changelog for the next release | Cedric Nugteren
2016-03-13 | Updated to version 0.6.0 | Cedric Nugteren
2016-03-06 | Added preliminary support for xHPR2 and xSPR2 routines | Cedric Nugteren
2016-02-28 | Updated the changelog with newly supported level-2 routines | Cedric Nugteren
2016-02-10 | Updated the changelog | Cedric Nugteren
2015-10-17 | Prepared the changelog for the next release | CNugteren
2015-10-17 | Updated to version 0.5.0 | CNugteren
2015-10-13 | Added guards for routine-specific level-3 pad kernels | CNugteren
2015-09-26 | Added TRMV/TBMV/TPMV routines | CNugteren
2015-09-19 | Added SBMV and SPMV routines | CNugteren
2015-09-19 | Added the HPMV routine | CNugteren
2015-09-19 | Added the HBMV routine | CNugteren
2015-09-18 | Improved the organization and performance of level 2 routines | CNugteren
2015-09-18 | Added first version of banded matrix-vector multiplication | CNugteren
2015-09-14 | Added xDOT/xDOTU/xDOTC dot-product routines | CNugteren
2015-08-22 | Added the XSWAP, XSCAL and XCOPY level-1 routines | CNugteren
2015-08-22 | Prepared the changelog for the next release | CNugteren