path: root/include/internal
Age | Commit message | Author
2016-05-15 | Added new tuning results for SGEMM and updated the performance graph for the ... | cnugteren
2016-05-15 | Added support for staggered/shuffled offsets for GEMM to improve performance ... | cnugteren
2016-05-02 | Added tuning results for AMD Hawaii (R9 290X) | Cedric Nugteren
2016-05-01 | Added tuning results for AMD Pitcairn (R9 270X) | Cedric Nugteren
2016-05-01 | Updated tuning database for reduction/dot kernels based on the new tuner; par... | Cedric Nugteren
2016-05-01 | Changed the index buffer of IxAMAX routines to unsigned int for proper buffer... | Cedric Nugteren
2016-05-01 | Added a program cache (per-context) next to the per-device binary cache | Cedric Nugteren
2016-04-30 | Added non-absolute minimum counter-part IxMIN of the BLAS routine IxAMAX | Cedric Nugteren
2016-04-28 | Fixed the cache to store binaries instead of OpenCL programs | Cedric Nugteren
2016-04-27 | Added non-absolute counter-parts xSUM and IxMAX of the BLAS routines xASUM an... | Cedric Nugteren
2016-04-27 | Moved all cache-related functions to a separate file; added a ClearCompiledPr... | Cedric Nugteren
2016-04-27 | Added a '-verbose' option to the test binaries to report errors in more detai... | Cedric Nugteren
2016-04-20 | Added support for the iSAMAX/iDAMAX/iCAMAX/iZAMAX routines | cnugteren
2016-04-14 | Added support for the SASUM/DASUM/ScASUM/DzASUM routines | cnugteren
2016-04-11 | Fixed the way the defaults are calculated in the database; added warning for ... | cnugteren
2016-04-09 | Events are now properly implemented using event waiting list and asking the u... | cnugteren
2016-04-02 | Added support for testing (performance and correctness) against a CPU BLAS li... | cnugteren
2016-03-30 | Merge branch 'level1_routines' into development | cnugteren
2016-03-30 | Made event an optional argument in the CLBlast C++ API | Cedric Nugteren
2016-03-30 | Added missing newline to the end of the public API file | Cedric Nugteren
2016-03-30 | Fixed properly passing of OpenCL events to CLBlast functions | Cedric Nugteren
2016-03-28 | Added preliminary support for the xNRM2 routines | Cedric Nugteren
2016-03-23 | Fixed the C-api export to be able to properly build a DLL on Windows | Cedric Nugteren
2016-03-19 | Added __declspec(dllexport) to create a DLL on Windows | Cedric Nugteren
2016-03-14 | Made the library thread-safe by guarding the kernel cache with a mutex | Cedric Nugteren
2016-03-12 | Added tuning results for the newest xGER family kernels | Cedric Nugteren
2016-03-12 | Added tuning results for the ARM Mali-T628 GPU | Cedric Nugteren
2016-03-06 | Added preliminary support for xHPR2 and xSPR2 routines | Cedric Nugteren
2016-03-02 | Added preliminary support for xHER2 and xSYR2 routines | Cedric Nugteren
2016-02-28 | Added tuning results for Intel Iris Pro and AMD R9 M370X | Cedric Nugteren
2016-02-28 | Added support for xHER, xHPR, xSYR, and xSPR routines | Cedric Nugteren
2016-02-28 | Fixed a compilation issue under AppleClang | Cedric Nugteren
2016-02-20 | Set a proper default precision for the CLBlast clients | Cedric Nugteren
2016-02-20 | Added support for xGERU and xGERC routines | Cedric Nugteren
2016-02-20 | Added XGER routine, kernel, and tuner | Cedric Nugteren
2016-02-07 | Added tuning parameters for various devices using the new database script | Cedric Nugteren
2016-02-07 | Added dictionary with short and long OpenCL vendor names to fix issues with I... | Cedric Nugteren
2016-02-06 | Fixed a linker error in the performance client under GCC | CNugteren
2016-01-30 | Updated to version 4.0 of the CLCudaAPI header | Cedric Nugteren
2016-01-30 | Added first auto-generated database headers from the Python database; only K4... | Cedric Nugteren
2015-10-23 | Added alpha and beta to tuner meta-data | CNugteren
2015-10-12 | Routine names are now all default arguments defined in the header | CNugteren
2015-09-26 | Added TRMV/TBMV/TPMV routines | CNugteren
2015-09-26 | Made buffer copying a const-method for the source | CNugteren
2015-09-19 | Added SBMV and SPMV routines | CNugteren
2015-09-19 | Added the HPMV routine | CNugteren
2015-09-19 | Added infrastructure for packed matrices | CNugteren
2015-09-19 | Added the HBMV routine | CNugteren
2015-09-18 | Improved the organization and performance of level 2 routines | CNugteren
2015-09-18 | Added first version of banded matrix-vector multiplication | CNugteren