Age | Commit message | Author |
2016-03-06 | Added preliminary support for xHPR2 and xSPR2 routines | Cedric Nugteren |
2016-03-02 | Added preliminary support for xHER2 and xSYR2 routines | Cedric Nugteren |
2016-02-28 | Added tuning results for Intel Iris Pro and AMD R9 M370X | Cedric Nugteren |
2016-02-28 | Added support for xHER, xHPR, xSYR, and xSPR routines | Cedric Nugteren |
2016-02-28 | Fixed a compilation issue under AppleClang | Cedric Nugteren |
2016-02-20 | Set a proper default precision for the CLBlast clients | Cedric Nugteren |
2016-02-20 | Added support for xGERU and xGERC routines | Cedric Nugteren |
2016-02-20 | Added XGER routine, kernel, and tuner | Cedric Nugteren |
2016-02-07 | Added tuning parameters for various devices using the new database script | Cedric Nugteren |
2016-02-07 | Added dictionary with short and long OpenCL vendor names to fix issues with I... | Cedric Nugteren |
2016-02-06 | Fixed a linker error in the performance client under GCC | CNugteren |
2016-01-30 | Updated to version 4.0 of the CLCudaAPI header | Cedric Nugteren |
2016-01-30 | Added first auto-generated database headers from the Python database; only K4... | Cedric Nugteren |
2015-10-23 | Added alpha and beta to tuner meta-data | CNugteren |
2015-10-12 | Routine names are now all default arguments defined in the header | CNugteren |
2015-09-26 | Added TRMV/TBMV/TPMV routines | CNugteren |
2015-09-26 | Made buffer copying a const-method for the source | CNugteren |
2015-09-19 | Added SBMV and SPMV routines | CNugteren |
2015-09-19 | Added the HPMV routine | CNugteren |
2015-09-19 | Added infrastructure for packed matrices | CNugteren |
2015-09-19 | Added the HBMV routine | CNugteren |
2015-09-18 | Improved the organization and performance of level 2 routines | CNugteren |
2015-09-18 | Added first version of banded matrix-vector multiplication | CNugteren |
2015-09-17 | Added interface of all level 2 routines | CNugteren |
2015-09-17 | Added script to generate API interface and implementation automatically | CNugteren |
2015-09-14 | Added xDOT/xDOTU/xDOTC dot-product routines | CNugteren |
2015-09-14 | Added extra temporary buffer to tuners in preparation of Xdot routines | CNugteren |
2015-09-14 | Added support for the dot buffer and offset argument | CNugteren |
2015-08-22 | Added the XSWAP, XSCAL and XCOPY level-1 routines | CNugteren |
2015-08-20 | Merge pull request #23 from CNugteren/tuner_database | Cedric Nugteren |
2015-08-19 | Add check for supported precision to the tuners | CNugteren |
2015-08-19 | Moved precision tester to utilities | CNugteren |
2015-08-19 | Added precision to the JSON output | CNugteren |
2015-08-13 | Added all supported routines to the C API | CNugteren |
2015-08-13 | Added initial version of C API with just one routine | CNugteren |
2015-08-13 | Added argument m,n,k metadata to JSON files | CNugteren |
2015-08-09 | Refactored the tuners, added JSON output | CNugteren |
2015-08-04 | Added distinguished names for GEMV inherited HEMV/SYMV | CNugteren |
2015-07-31 | Added HEMV routine | CNugteren |
2015-07-31 | Added SYMV routine | CNugteren |
2015-07-27 | Now using the new Claduc C++11 OpenCL header | CNugteren |
2015-07-22 | Set the correct name for AMD OpenCL devices | CNugteren |
2015-07-22 | Updated GEMM tuning results for Tahiti | CNugteren |
2015-07-22 | Added workgroup shuffle option to transpose kernel for AMD GPUs | CNugteren |
2015-07-19 | Kernel caching is now based on a routine's name | CNugteren |
2015-07-19 | The kernel source string is now a routine's member variable | CNugteren |
2015-07-19 | Fixed complex performance on Intel Iris | CNugteren |
2015-07-13 | Updated interface of the PadCopyTransposeMatrix method | CNugteren |
2015-07-12 | Added subfolders for the level1/2/3 routines | CNugteren |
2015-07-12 | Added the HEMM routine, tester, and client | CNugteren |