path: root/src/tuning/kernels
Age | Commit message | Author
2021-03-13 | set the correct flop count for xgemm | JishinMaster
2020-02-17 | Catches all exceptions of the tuners | Cedric Nugteren
2018-12-31 | Added the forgotten batch dimension to the tuner to get correct kernel execut... | Cedric Nugteren
2018-12-18 | Fix the xconvgemm tuner | Koichi Akabe
2018-12-18 | Added first version of a tuner for the ConvGemm direct kernel | Cedric Nugteren
2018-09-15 | Fixed an MSVC compilation error due to large strings | Cedric Nugteren
2018-07-28 | Added print statements to indicate the 4 stages of GEMM tuning | Cedric Nugteren
2018-04-07 | Extended the GEMM tuner to be able to tune the new 'kernel 1' | Cedric Nugteren
2018-03-30 | Added argument checking for the GEMM tuner: expects m/n to be multiples of MW... | Cedric Nugteren
2018-03-22 | Added the OpenCL local memory size constraint to the tuners | Cedric Nugteren
2018-03-10 | Fixed a few things for the new tuning API | Cedric Nugteren
2018-03-06 | Fixed compilation issue in Xger tuner | Cedric Nugteren
2018-03-03 | Separate kernel tuners in .cpp with main and .hpp with settings | Cedric Nugteren
2018-02-20 | Fixed several issues in the new invert tuner | Cedric Nugteren
2018-01-25 | Changed the default number of runs for the GEMV tuner to fix issues for FP16 | Cedric Nugteren
2017-12-23 | Fixed unused variable warnings showing up with Clang | Cedric Nugteren
2017-12-23 | Split the invert kernel in two parts to prevent error C1091 in MSVC 2013 | Cedric Nugteren
2017-12-19 | Added skeleton for a tuner for the invert kernel | Cedric Nugteren
2017-12-18 | Reformatted tuning code to make compilation faster | Cedric Nugteren
2017-12-10 | Split GEMM kernel in 4 files instead of 3 due to MSVC 2013 string length limit | Cedric Nugteren
2017-11-19 | Modified the kernel tuners to use the newly integrated auto-tuner | Cedric Nugteren
2017-10-03 | Gemm in-direct implementation now uses only 1 larger instead of max 3 optiona... | Cedric Nugteren
2017-09-30 | Refactored the tuning architecture: less duplicate now; more defaults | Cedric Nugteren
2017-08-31 | Fixed some things in the tuner: bugs, style, and defaults to random search | Cedric Nugteren
2017-08-21 | Minor updates after merging in the PSO addition to the tuners | Cedric Nugteren
2017-08-21 | Remove multistrategy and related functions | mcian
2017-08-09 | Revert the xgemm strategy to default. If user wants to use multistrategy can ... | mcian
2017-08-09 | Use cltune::SearchMethod enum instead of int values | mcian
2017-07-23 | Code refactoring | mcian
2017-07-17 | Add PSO parameters support and search strategy selection from command line | mcian
2017-05-11 | Re-added random tuning for GEMM after accidental removal | Cedric Nugteren
2017-04-22 | Increased the default number of runs for the tuner from 2 up to 10 for fast k... | Cedric Nugteren
2017-04-21 | Increased the default number of runs for GEMV tuning; updated GEMV tuning res... | Cedric Nugteren
2017-04-17 | Fixed a namespace clash with CUDA FP16 for the half-datatype | Cedric Nugteren
2017-04-14 | Added a new Xaxpy kernel in between the regular and fast version in | Cedric Nugteren
2017-03-14 | Added the possibility to tune batched kernels | Cedric Nugteren
2016-11-27 | Made it possible to use the command-line environmental variables for each exe... | Cedric Nugteren
2016-10-22 | Moved files around a bit; created a utilities subfolder | Cedric Nugteren
2016-10-03 | Re-organised GEMM direct kernel and added faster fall-back version for incomp... | Cedric Nugteren
2016-10-02 | Set the default number of runs for all kernels to at least 2 runs | Cedric Nugteren
2016-10-02 | Specialised the GEMM direct kernel in four ways for transposing/non-transposi... | Cedric Nugteren
2016-10-02 | Split the GEMM direct kernel into two files; set the default tuning target to... | Cedric Nugteren
2016-10-01 | Added padding to the local memory of the GEMM direct kernel | Cedric Nugteren
2016-10-01 | Added default num-runs to the tuner adding averaging over 10 runs as a defaul... | Cedric Nugteren
2016-10-01 | Merge branch 'development' into gemm_direct | Cedric Nugteren
2016-09-27 | Fixed the local memory size computation for the GEMM tuners | Cedric Nugteren
2016-09-25 | Added a first version of a tuner for the GEMM direct kernel; collapsed MWGD, ... | Cedric Nugteren
2016-09-12 | Split the XGEMM kernel further up: now in 3 parts. This is done because MSVC ... | Cedric Nugteren
2016-09-06 | Split GEMM tuning in two parts: a small set of tuning parameters which is exp... | Cedric Nugteren
2016-08-21 | Increased the ratio of GEMM tuning results to explore; reduced the tuning sea... | Cedric Nugteren