path: root/CHANGELOG
Age  Commit message  Author
2016-12-18  Fixed a bug when using offsets in the direct GEMM kernels  Cedric Nugteren
2016-11-27  Updated to version 0.10.0  Cedric Nugteren
2016-11-27  Made it possible to use the command-line environmental variables for each executable and without re-running CMake  Cedric Nugteren
2016-11-24  Merge pull request #125 from CNugteren/netlib_blas_api  Cedric Nugteren
    Netlib CBLAS API for CLBlast
2016-11-20  Fixed a bug in the TRMM routine caused by overwriting input data before consuming everything  Cedric Nugteren
2016-10-25  Added an example and documentation for the Netlib CBLAS API  Cedric Nugteren
2016-10-22  All enums in the C API are now prefixed with CLBlast to avoid potential name clashes with other projects  Cedric Nugteren
2016-10-22  Added documentation for the better exception handling  Cedric Nugteren
2016-10-22  Fixed a bug in the SYRK/SYR2K/HERK/HER2K routines that would occur with specific tuning parameters  Cedric Nugteren
2016-10-15  Added documentation and minor refactoring for the recent support of static library compilation  Cedric Nugteren
2016-10-13  Added tuning results for Intel HD Graphics IvyBridge GPU  Cedric Nugteren
2016-10-10  Added support for compiling the library, the client, and the samples under MSVC 2013  Cedric Nugteren
2016-10-06  Added first tuning results for the single-kernel direct GEMM implementation  Cedric Nugteren
2016-09-27  Added an option to run tuned kernels multiple times to average execution times; requires CLTune 2.5.0  Cedric Nugteren
2016-09-27  Updated to version 8.0 of the CLCudaAPI header  Cedric Nugteren
2016-09-27  Added more relaxed error checking for the half-precision tests  Cedric Nugteren
2016-09-22  Fixed a bug waiting for an invalid event in case of a non-successful CLBlast call in the tests and samples  Cedric Nugteren
2016-09-21  It is now possible to set the OpenCL compiler options through an environmental variable  Cedric Nugteren
2016-09-13  Updated to version 0.9.0  Cedric Nugteren
2016-09-04  The GEMM kernel no longer adds beta*C in case beta is zero; this would cause problems if C contains NaNs  Cedric Nugteren
2016-08-22  Merge branch 'database_defaults' into development  Cedric Nugteren
2016-08-21  Updated the changelog; refactored the database-get-bests code a bit  Cedric Nugteren
2016-08-20  Merge branch 'development' of github.com:CNugteren/CLBlast into development  Cedric Nugteren
    Conflicts:
        README.md
2016-08-20  Merge branch 'master' of https://github.com/dvasschemacq/CLBlast into dvasschemacq-master  Cedric Nugteren
    Conflicts:
        src/kernels/level1/xaxpy.opencl
        src/kernels/level2/xgemv.opencl
        src/kernels/level2/xgemv_fast.opencl
        src/kernels/level2/xger.opencl
        src/kernels/level2/xher.opencl
        src/kernels/level2/xher2.opencl
        src/kernels/level3/xgemm_part2.opencl
2016-07-28  Minor update regarding the previous CMake export/install target changes  Cedric Nugteren
2016-07-23  Fixed a bug in the new XgemvFastRot kernel related to local memory size  Cedric Nugteren
2016-07-10  Now passing alpha/beta to the kernel as arguments as before fp16 support; in case of fp16, arguments are cast on host and in kernel  Cedric Nugteren
2016-07-08  Cache now compares cl_context instead of a pointer to a context; added verbose print statements to the cache  Cedric Nugteren
2016-07-06  Added an option to the performance clients to do a warm-up run before timing  Cedric Nugteren
2016-07-03  Added tuning results for GTX670, GTX750, and GTX1070 (thanks to gcp)  Cedric Nugteren
2016-07-02  Fixed some memory leaks related to events not properly cleaned-up  Cedric Nugteren
2016-06-30  Added declspec(dllexport) to ClearCache and FillCache, and added declspec(dllimport) when not building the library  Cedric Nugteren
2016-06-29  Updated to version 6.0 of the CLCudaAPI header  Cedric Nugteren
2016-06-28  Prepared the changelog for the next release  Cedric Nugteren
2016-06-28  Updated to version 0.8.0  Cedric Nugteren
2016-06-27  Added Appveyor Windows CI support  Cedric Nugteren
2016-06-19  Renamed all C++ source files to .cpp to match the .hpp extension better  Cedric Nugteren
2016-06-16  Added XOMATCOPY routines to perform out-of-place matrix scaling, copying, and/or transposing  Cedric Nugteren
2016-06-13  Improved API documentation and added documentation for level-2 and level-3 routines  Cedric Nugteren
2016-06-01  Added tuning parameters for 'GRID K520' and 'HD Graphics Skylake ULT GT2'  Cedric Nugteren
2016-05-31  Made use of CMake's built-in unit testing, allowing all tests to be run using 'make test'  Cedric Nugteren
2016-05-30  Increased the verbosity of the -verbose option in the correctness tests  Cedric Nugteren
2016-05-30  Separated the performance tests (clients) from the correctness tests in CMake  Cedric Nugteren
2016-05-30  Merge branch 'half_precision' into development  Cedric Nugteren
2016-05-25  Added level-3 half-precision routines HGEMM/HSYMM/HSYRK/HSYR2K/HTRMM  Cedric Nugteren
2016-05-22  Added level-2 half-precision routines HGER/HSYR/HSPR/HSYR2/HSPR2  Cedric Nugteren
2016-05-22  Added level-2 half-precision routines HGEMV/HGBMV/HHEMV/HHBMV/HHPMV/HSYMV/HSBMV/HSPMV/HTRMV/HTBMV/HTPMV  Cedric Nugteren
2016-05-22  Added level-1 half-precision routines HSWAP/HSCAL/HCOPY/HAXPY/HDOT/HNRM2/HASUM/HSUM/iHAMAX/iHMAX/iHMIN  Cedric Nugteren
2016-05-18  Merged in latest changes from 0.7.1 release  Cedric Nugteren
2016-05-18  Prepared the changelog for the next release  Cedric Nugteren