Age | Commit message | Author
2016-03-19 | Added __declspec(dllexport) to create a DLL on Windows | Cedric Nugteren
2016-03-14 | Made the library thread-safe by guarding the kernel cache with a mutex | Cedric Nugteren
2016-03-13 | Prepared the changelog for the next release | Cedric Nugteren
2016-03-13 | Updated to version 0.6.0 | Cedric Nugteren
2016-03-13 | Updated Travis to reflect the changes in the Khronos website | Cedric Nugteren
2016-03-13 | Updated the README file | Cedric Nugteren
2016-03-13 | Updated Travis script to take into account the missing OpenCL packages | Cedric Nugteren
2016-03-13 | Updated Travis script to fix the fglrx=2:8.960-0ubuntu1 issue | Cedric Nugteren
2016-03-12 | Added tuning results for the newest xGER family kernels | Cedric Nugteren
2016-03-12 | Added performance graphs for Intel Iris and Radeon M370X | Cedric Nugteren
2016-03-12 | Added tuning results for the ARM Mali-T628 GPU | Cedric Nugteren
2016-03-06 | Fixed a bug in the GER-family of routines due to incorrect division of the workgroup size | Cedric Nugteren
2016-03-06 | Made testing against clBLAS in the client binaries truely optional (was partly implemented before) | Cedric Nugteren
2016-03-06 | Adjusted the correctness-test error margins | Cedric Nugteren
2016-03-06 | Merge branch 'rank2_update_routines' into development | Cedric Nugteren
2016-03-06 | Added preliminary support for xHPR2 and xSPR2 routines | Cedric Nugteren
2016-03-02 | Added preliminary support for xHER2 and xSYR2 routines | Cedric Nugteren
2016-02-28 | Added tuning results for Intel Iris Pro and AMD R9 M370X | Cedric Nugteren
2016-02-28 | Updated the changelog with newly supported level-2 routines | Cedric Nugteren
2016-02-28 | Merge branch 'ger_routines' into development | Cedric Nugteren
2016-02-28 | Fixed a couple of correctness bugs in the Xher kernels | Cedric Nugteren
2016-02-28 | Added support for xHER, xHPR, xSYR, and xSPR routines | Cedric Nugteren
2016-02-28 | Fixed a compilation issue under AppleClang | Cedric Nugteren
2016-02-20 | Set a proper default precision for the CLBlast clients | Cedric Nugteren
2016-02-20 | Added support for xGERU and xGERC routines | Cedric Nugteren
2016-02-20 | Added XGER routine, kernel, and tuner | Cedric Nugteren
2016-02-10 | Updated the changelog | Cedric Nugteren
2016-02-08 | Fixed warnings under MSVC | CNugteren
2016-02-08 | Separated the GEMM kernel in two parts to reduce string length for MSVC | Cedric Nugteren
2016-02-08 | Split-up the XGEMV kernel in two parts | Cedric Nugteren
2016-02-07 | Added tuning parameters for various devices using the new database script | Cedric Nugteren
2016-02-07 | Various fixes to the database script | Cedric Nugteren
2016-02-07 | Added dictionary with short and long OpenCL vendor names to fix issues with Intel having multiple names | Cedric Nugteren
2016-02-07 | Made the tuning database an optional external download | Cedric Nugteren
2016-02-06 | Made the database script compatible with Python 3 | CNugteren
2016-02-06 | Reduced the maximum workgroup-size for GEMV kernels further | CNugteren
2016-02-06 | Changed the order of tuners in the alltuners target | Cedric Nugteren
2016-02-06 | Reduced unrolling factor in xgemv kernel to reduce compilation times | CNugteren
2016-02-06 | Fixed a linker error in the performance client under GCC | CNugteren
2016-01-30 | Fixes for compilation under Visual Studio | CNugteren
2016-01-30 | Prepared for MSVC support | Cedric Nugteren
2016-01-30 | Fixed a bug in the graph scripts (thanks to Victor Pakhomov) | Cedric Nugteren
2016-01-30 | Updated to version 4.0 of the CLCudaAPI header | Cedric Nugteren
2016-01-30 | Merge branch 'tuning_database' into development | Cedric Nugteren
    is merge is necessary,
2016-01-30 | Added first auto-generated database headers from the Python database; only K40 and Iris supported now | Cedric Nugteren
2016-01-24 | Minor improvements to the database script, including proper file paths | Cedric Nugteren
2016-01-24 | Added Python function to compute defaults for a particular device/vendor combination | Cedric Nugteren
2016-01-23 | Updated FindOpenCL for Intel Linux OpenCL paths | Cedric Nugteren
2015-10-28 | Added tuning data for Tesla K40 | CNugteren
2015-10-28 | Now sets local memory size in xgemv tuner properly | CNugteren