path: root/CHANGELOG
Age | Commit message | Author
2017-04-16 | Finalized support for performance testing against cuBLAS | Cedric Nugteren
2017-04-10 | Updated the changelog with the Apple CPU override | Cedric Nugteren
2017-03-26 | Replaced the R graph scripts with Python/Matplotlib benchmark scripts | Cedric Nugteren
2017-03-11 | Added initial naive version of the batched GEMM routine based on the direct GEMM kernel | Cedric Nugteren
2017-03-10 | Added proper testing of the alpha parameter; finalized the batched AXPY implementation | Cedric Nugteren
2017-02-27 | Added L2 error computation and checking for half-precision tests | Cedric Nugteren
2017-02-27 | Fixed half-precision bugs in HTBMV/HTPMV/HTRMV/HSYR2K/HTRMM related to incorrect constants | Cedric Nugteren
2017-02-26 | Merge branch 'development' into triangular_solvers | Cedric Nugteren
2017-02-25 | Added documentation for the TRSV and TRSM routines | Cedric Nugteren
2017-02-18 | Added documentation for the OverrideParameters function | Cedric Nugteren
2017-01-24 | Updated the changelog for PR131 and PR132 | Cedric Nugteren
2017-01-07 | Updated the link to cl.hpp in the Khronos registry for the samples | Cedric Nugteren
2017-01-07 | Always enables cl_khr_fp64 when running double-precision, not just for OpenCL 1.1 or lower | Cedric Nugteren
2017-01-03 | Added tuning results for the AMD Turks GPU and the Intel Core i7-2670QM CPU | Cedric Nugteren
2016-12-18 | Fixed a bug when using offsets in the direct GEMM kernels | Cedric Nugteren
2016-11-27 | Updated to version 0.10.0 | Cedric Nugteren
2016-11-27 | Made it possible to use the command-line environmental variables for each executable and without re-running CMake | Cedric Nugteren
2016-11-24 | Merge pull request #125 from CNugteren/netlib_blas_api | Cedric Nugteren
    Netlib CBLAS API for CLBlast
2016-11-20 | Fixed a bug in the TRMM routine caused by overwriting input data before consuming everything | Cedric Nugteren
2016-10-25 | Added an example and documentation for the Netlib CBLAS API | Cedric Nugteren
2016-10-22 | All enums in the C API are now prefixed with CLBlast to avoid potential name clashes with other projects | Cedric Nugteren
2016-10-22 | Added documentation for the better exception handling | Cedric Nugteren
2016-10-22 | Fixed a bug in the SYRK/SYR2K/HERK/HER2K routines that would occur with specific tuning parameters | Cedric Nugteren
2016-10-15 | Added documentation and minor refactoring for the recent support of static library compilation | Cedric Nugteren
2016-10-13 | Added tuning results for Intel HD Graphics IvyBridge GPU | Cedric Nugteren
2016-10-10 | Added support for compiling the library, the client, and the samples under MSVC 2013 | Cedric Nugteren
2016-10-06 | Added first tuning results for the single-kernel direct GEMM implementation | Cedric Nugteren
2016-09-27 | Added an option to run tuned kernels multiple times to average execution times; requires CLTune 2.5.0 | Cedric Nugteren
2016-09-27 | Updated to version 8.0 of the CLCudaAPI header | Cedric Nugteren
2016-09-27 | Added more relaxed error checking for the half-precision tests | Cedric Nugteren
2016-09-22 | Fixed a bug waiting for an invalid event in case of a non-successful CLBlast call in the tests and samples | Cedric Nugteren
2016-09-21 | It is now possible to set the OpenCL compiler options through an environmental variable | Cedric Nugteren
2016-09-13 | Updated to version 0.9.0 | Cedric Nugteren
2016-09-04 | The GEMM kernel no longer adds beta*C in case beta is zero; this would cause problems if C contains NaNs | Cedric Nugteren
2016-08-22 | Merge branch 'database_defaults' into development | Cedric Nugteren
2016-08-21 | Updated the changelog; refactored the database-get-bests code a bit | Cedric Nugteren
2016-08-20 | Merge branch 'development' of github.com:CNugteren/CLBlast into development | Cedric Nugteren
    Conflicts:
        README.md
2016-08-20 | Merge branch 'master' of https://github.com/dvasschemacq/CLBlast into dvasschemacq-master | Cedric Nugteren
    Conflicts:
        src/kernels/level1/xaxpy.opencl
        src/kernels/level2/xgemv.opencl
        src/kernels/level2/xgemv_fast.opencl
        src/kernels/level2/xger.opencl
        src/kernels/level2/xher.opencl
        src/kernels/level2/xher2.opencl
        src/kernels/level3/xgemm_part2.opencl
2016-07-28 | Minor update regarding the previous CMake export/install target changes | Cedric Nugteren
2016-07-23 | Fixed a bug in the new XgemvFastRot kernel related to local memory size | Cedric Nugteren
2016-07-10 | Now passing alpha/beta to the kernel as arguments as before fp16 support; in case of fp16 arguments are cast on host and in kernel | Cedric Nugteren
2016-07-08 | Cache now compares cl_context instead of a pointer to a context; added verbose print statements to the cache | Cedric Nugteren
2016-07-06 | Added an option to the performance clients to do a warm-up run before timing | Cedric Nugteren
2016-07-03 | Added tuning results for GTX670, GTX750, and GTX1070 (thanks to gcp) | Cedric Nugteren
2016-07-02 | Fixed some memory leaks related to events not properly cleaned-up | Cedric Nugteren
2016-06-30 | Added declspec(dllexport) to ClearCache and FillCache, and added declspec(dllimport) when not building the library | Cedric Nugteren
2016-06-29 | Updated to version 6.0 of the CLCudaAPI header | Cedric Nugteren
2016-06-28 | Prepared the changelog for the next release | Cedric Nugteren
2016-06-28 | Updated to version 0.8.0 | Cedric Nugteren
2016-06-27 | Added Appveyor Windows CI support | Cedric Nugteren