path: root/src/tuning
Age  Commit message  Author
2021-05-22  Fix issue with printing out-of-bounds local/global sizes for level 1 tuners  Cedric Nugteren
2021-03-13  set the correct flop count for xgemm  JishinMaster
2021-01-20  Use reference types to prevent unnecessary copying  Jerry James
2020-05-11  Increase display width of the local/global sizes  Cedric Nugteren
2020-05-10  Made sure that the global workgroup size is a multiple of the local size in t...  Cedric Nugteren
2020-05-10  Added logging of local/global workgroup sizes when run the tuners  Cedric Nugteren
2020-05-03  Move queue creation out of the tuner loop  Cedric Nugteren
2020-02-17  Catches all exceptions of the tuners  Cedric Nugteren
2018-12-31  Added support for the convgemm tuner in the tuner database  Cedric Nugteren
2018-12-31  Added the forgotten batch dimension to the tuner to get correct kernel execut...  Cedric Nugteren
2018-12-18  Fix the xconvgemm tuner  Koichi Akabe
2018-12-18  Added first version of a tuner for the ConvGemm direct kernel  Cedric Nugteren
2018-09-15  Fixed an MSVC compilation error due to large strings  Cedric Nugteren
2018-07-28  Added print statements to indicate the 4 stages of GEMM tuning  Cedric Nugteren
2018-07-28  The tuners now also check for valid local thread configurations and skip inva...  Cedric Nugteren
2018-07-25  Added code to report the average tuning results  Cedric Nugteren
2018-05-19  Added an option to run the routine tuner for a single specific GEMM size  Cedric Nugteren
2018-05-19  Fixed compilation issues  Cedric Nugteren
2018-05-19  The GEMM routine tuner now loads kernel JSON tuning results from disk if avai...  Cedric Nugteren
2018-05-17  Added a canary region for overflow detection to the tuners  Cedric Nugteren
2018-04-07  Extended the GEMM tuner to be able to tune the new 'kernel 1'  Cedric Nugteren
2018-03-30  Added argument checking for the GEMM tuner: expects m/n to be multiples of MW...  Cedric Nugteren
2018-03-22  Added the OpenCL local memory size constraint to the tuners  Cedric Nugteren
2018-03-21  Re-added support for local memory size constraint checking in the tuner  Cedric Nugteren
2018-03-10  Fixed an issue for DLL linking under Windows  Cedric Nugteren
2018-03-10  Fixed a few things for the new tuning API  Cedric Nugteren
2018-03-10  Completed the API for all tuneable kernels  Cedric Nugteren
2018-03-09  Added several more tuner API functions  Cedric Nugteren
2018-03-06  Fixed compilation issue in Xger tuner  Cedric Nugteren
2018-03-06  First version of the tuning API, added interface for copy-kernel, added sample  Cedric Nugteren
2018-03-03  Separate kernel tuners in .cpp with main and .hpp with settings  Cedric Nugteren
2018-02-20  Fixed several issues in the new invert tuner  Cedric Nugteren
2018-01-25  Moved some constants from global scope to a function; removed unnecessary inc...  Cedric Nugteren
2018-01-25  Changed the default number of runs for the GEMV tuner to fix issues for FP16  Cedric Nugteren
2018-01-18  Made GEMM routine tuning a bit more generic in preparation of possible separa...  Cedric Nugteren
2018-01-15  Factored out the generic parts of the GEMM routine tuner  Cedric Nugteren
2018-01-06  Fixed a vendor naming bug in the tuners and in the database  Cedric Nugteren
2017-12-23  Fixed unused variable warnings showing up with Clang  Cedric Nugteren
2017-12-23  Now calling main TRSV routine again to fix compilation in MSVC  Cedric Nugteren
2017-12-23  Split the invert kernel in two parts to prevent error C1091 in MSVC 2013  Cedric Nugteren
2017-12-23  Updated the database to use the new TRSV and Invert tuners  Cedric Nugteren
2017-12-23  Added TRSV block-size tuner  Cedric Nugteren
2017-12-19  Added skeleton for a tuner for the invert kernel  Cedric Nugteren
2017-12-18  Reformatted tuning code to make compilation faster  Cedric Nugteren
2017-12-17  Fixed an issue with the tuner: it was using platform vendor rather than devic...  Cedric Nugteren
2017-12-17  Fixed an unnecessary overflow issue on 32-bit systems  Cedric Nugteren
2017-12-10  Split GEMM kernel in 4 files instead of 3 due to MSVC 2013 string length limit  Cedric Nugteren
2017-12-10  Fixed an issue in the tuners to prevent error -14 from persisting (CL_EXEC_ST...  Cedric Nugteren
2017-12-09  Made the pre-processor run by default for ARM and Qualcomm GPUs  Cedric Nugteren
2017-11-30  Integrated pre-processor in compilation flow, default is still disabled  Cedric Nugteren