Age | Commit message | Author
2016-02-07 | Made the tuning database an optional external download | Cedric Nugteren
2016-02-06 | Made the database script compatible with Python 3 | CNugteren
2016-02-06 | Reduced the maximum workgroup-size for GEMV kernels further | CNugteren
2016-02-06 | Changed the order of tuners in the alltuners target | Cedric Nugteren
2016-02-06 | Reduced unrolling factor in xgemv kernel to reduce compilation times | CNugteren
2016-02-06 | Fixed a linker error in the performance client under GCC | CNugteren
2016-01-30 | Fixes for compilation under Visual Studio | CNugteren
2016-01-30 | Prepared for MSVC support | Cedric Nugteren
2016-01-30 | Fixed a bug in the graph scripts (thanks to Victor Pakhomov) | Cedric Nugteren
2016-01-30 | Updated to version 4.0 of the CLCudaAPI header | Cedric Nugteren
2016-01-30 | Merge branch 'tuning_database' into development | Cedric Nugteren
2016-01-30 | Added first auto-generated database headers from the Python database; only K4... | Cedric Nugteren
2016-01-24 | Minor improvements to the database script, including proper file paths | Cedric Nugteren
2016-01-24 | Added Python function to compute defaults for a particular device/vendor comb... | Cedric Nugteren
2016-01-23 | Updated FindOpenCL for Intel Linux OpenCL paths | Cedric Nugteren
2015-10-28 | Added tuning data for Tesla K40 | CNugteren
2015-10-28 | Now sets local memory size in xgemv tuner properly | CNugteren
2015-10-25 | Added initial tuning database with Intel Iris data | CNugteren
2015-10-25 | Updated tuning database script according to the new JSON format | CNugteren
2015-10-25 | Fixed an arguments-related bug in the GEMV tuner | CNugteren
2015-10-25 | Moved the tuner database script to a separate folder | CNugteren
2015-10-23 | Added alpha and beta to tuner meta-data | CNugteren
2015-10-17 | Prepared the changelog for the next release | CNugteren
2015-10-17 | Updated to version 0.5.0 | CNugteren
2015-10-17 | Travis now also build the development branch | CNugteren
2015-10-17 | Merge pull request #28 from CNugteren/kernels_reorganization | Cedric Nugteren
2015-10-13 | Added guards for routine-specific level-3 pad kernels | CNugteren
2015-10-12 | Routine names are now all default arguments defined in the header | CNugteren
2015-10-12 | Moved level3 kernel files to a subfolder | CNugteren
2015-09-26 | Merge pull request #27 from CNugteren/level2_matrix_vector | Cedric Nugteren
2015-09-26 | Added TRMV/TBMV/TPMV routines | CNugteren
2015-09-26 | Made buffer copying a const-method for the source | CNugteren
2015-09-19 | Added SBMV and SPMV routines | CNugteren
2015-09-19 | Added the HPMV routine | CNugteren
2015-09-19 | Added infrastructure for packed matrices | CNugteren
2015-09-19 | Added the HBMV routine | CNugteren
2015-09-18 | Improved the organization and performance of level 2 routines | CNugteren
2015-09-18 | Added first version of banded matrix-vector multiplication | CNugteren
2015-09-18 | Merge pull request #26 from CNugteren/routine_definitions | Cedric Nugteren
2015-09-18 | Added generated main functions for correctness/performance tests for level 2 ... | CNugteren
2015-09-17 | Added interface of all level 2 routines | CNugteren
2015-09-17 | Added script to generate API interface and implementation automatically | CNugteren
2015-09-14 | Made Travis always build pushes to the master branch | CNugteren
2015-09-14 | Merge pull request #25 from CNugteren/level1_routines | Cedric Nugteren
2015-09-14 | Removed routines from the table which are not supported by clBLAS | CNugteren
2015-09-14 | Added xDOT/xDOTU/xDOTC dot-product routines | CNugteren
2015-09-14 | Added extra temporary buffer to tuners in preparation of Xdot routines | CNugteren
2015-09-14 | Added support for the dot buffer and offset argument | CNugteren
2015-08-24 | Minor update of options-printing syntax | CNugteren
2015-08-22 | Added the XSWAP, XSCAL and XCOPY level-1 routines | CNugteren