path: root/src/tuning
Age | Commit message | Author
2017-12-23 | Fixed unused variable warnings showing up with Clang | Cedric Nugteren
2017-12-23 | Now calling main TRSV routine again to fix compilation in MSVC | Cedric Nugteren
2017-12-23 | Split the invert kernel in two parts to prevent error C1091 in MSVC 2013 | Cedric Nugteren
2017-12-23 | Updated the database to use the new TRSV and Invert tuners | Cedric Nugteren
2017-12-23 | Added TRSV block-size tuner | Cedric Nugteren
2017-12-19 | Added skeleton for a tuner for the invert kernel | Cedric Nugteren
2017-12-18 | Reformatted tuning code to make compilation faster | Cedric Nugteren
2017-12-17 | Fixed an issue with the tuner: it was using platform vendor rather than devic... | Cedric Nugteren
2017-12-17 | Fixed an unnecessary overflow issue on 32-bit systems | Cedric Nugteren
2017-12-10 | Split GEMM kernel in 4 files instead of 3 due to MSVC 2013 string length limit | Cedric Nugteren
2017-12-10 | Fixed an issue in the tuners to prevent error -14 from persisting (CL_EXEC_ST... | Cedric Nugteren
2017-12-09 | Made the pre-processor run by default for ARM and Qualcomm GPUs | Cedric Nugteren
2017-11-30 | Integrated pre-processor in compilation flow, default is still disabled | Cedric Nugteren
2017-11-20 | Fixes some displaying issues in the GEMM routine tuner | Cedric Nugteren
2017-11-19 | Fixed a variety of warnings and an error for MSVC2013 compilation | Cedric Nugteren
2017-11-19 | Added compilation timing and better compilation error reporting | Cedric Nugteren
2017-11-19 | Some fixed for the new auto-tuner to be compatible with the Python scripts | Cedric Nugteren
2017-11-19 | Revived the GEMM routine tuner; minor formatting changes | Cedric Nugteren
2017-11-19 | Modified the kernel tuners to use the newly integrated auto-tuner | Cedric Nugteren
2017-11-17 | Moved some tuning functions from .hpp to .cpp | Cedric Nugteren
2017-11-17 | Moved compilation function to separate file; removed dependency of tuners of ... | Cedric Nugteren
2017-11-16 | Added printing of the best parameters for the new tuner | Cedric Nugteren
2017-11-15 | Added first version of integrated and re-written auto-tuner | Cedric Nugteren
2017-11-06 | Changed GEMM routine tuner's scoring to use L2 measure instead for better ave... | Cedric Nugteren
2017-11-02 | Integrated the GEMM routine tuner for kernel selection; added first tuning re... | Cedric Nugteren
2017-10-30 | Added collecting and printing of scores for the kernel-selection tuner | Cedric Nugteren
2017-10-28 | Added initial version of a GEMM kernel selection tuner | Cedric Nugteren
2017-10-03 | Gemm in-direct implementation now uses only 1 larger instead of max 3 optiona... | Cedric Nugteren
2017-09-30 | Refactored the tuning architecture: less duplicate now; more defaults | Cedric Nugteren
2017-09-10 | Added the new vendor-architecture-name hierarchy to the tuners as well | Cedric Nugteren
2017-08-31 | Fixed some things in the tuner: bugs, style, and defaults to random search | Cedric Nugteren
2017-08-21 | Minor updates after merging in the PSO addition to the tuners | Cedric Nugteren
2017-08-21 | Remove multistrategy and related functions | mcian
2017-08-09 | Revert the xgemm strategy to default. If user wants to use multistrategy can ... | mcian
2017-08-09 | Use cltune::SearchMethod enum instead of int values | mcian
2017-07-23 | Code refactoring | mcian
2017-07-17 | Add PSO parameters support and search strategy selection from command line | mcian
2017-05-11 | Re-added random tuning for GEMM after accidental removal | Cedric Nugteren
2017-04-22 | Increased the default number of runs for the tuner from 2 up to 10 for fast k... | Cedric Nugteren
2017-04-21 | Increased the default number of runs for GEMV tuning; updated GEMV tuning res... | Cedric Nugteren
2017-04-17 | Fixed a namespace clash with CUDA FP16 for the half-datatype | Cedric Nugteren
2017-04-14 | Added a new Xaxpy kernel in between the regular and fast version in | Cedric Nugteren
2017-03-14 | Added the possibility to tune batched kernels | Cedric Nugteren
2017-03-05 | Changed the way the test-data is generated: now using a single MT generator a... | Cedric Nugteren
2016-11-27 | Made it possible to use the command-line environmental variables for each exe... | Cedric Nugteren
2016-10-22 | Moved files around a bit; created a utilities subfolder | Cedric Nugteren
2016-10-03 | Re-organised GEMM direct kernel and added faster fall-back version for incomp... | Cedric Nugteren
2016-10-02 | Set the default number of runs for all kernels to at least 2 runs | Cedric Nugteren
2016-10-02 | Specialised the GEMM direct kernel in four ways for transposing/non-transposi... | Cedric Nugteren
2016-10-02 | Split the GEMM direct kernel into two files; set the default tuning target to... | Cedric Nugteren