path: root/src/tuning
| Age | Commit message | Author |
|---|---|---|
| 2017-11-19 | Added compilation timing and better compilation error reporting | Cedric Nugteren |
| 2017-11-19 | Some fixed for the new auto-tuner to be compatible with the Python scripts | Cedric Nugteren |
| 2017-11-19 | Revived the GEMM routine tuner; minor formatting changes | Cedric Nugteren |
| 2017-11-19 | Modified the kernel tuners to use the newly integrated auto-tuner | Cedric Nugteren |
| 2017-11-17 | Moved some tuning functions from .hpp to .cpp | Cedric Nugteren |
| 2017-11-17 | Moved compilation function to separate file; removed dependency of tuners of the CLBlast library | Cedric Nugteren |
| 2017-11-16 | Added printing of the best parameters for the new tuner | Cedric Nugteren |
| 2017-11-15 | Added first version of integrated and re-written auto-tuner | Cedric Nugteren |
| 2017-11-06 | Changed GEMM routine tuner's scoring to use L2 measure instead for better averaging | Cedric Nugteren |
| 2017-11-02 | Integrated the GEMM routine tuner for kernel selection; added first tuning results | Cedric Nugteren |
| 2017-10-30 | Added collecting and printing of scores for the kernel-selection tuner | Cedric Nugteren |
| 2017-10-28 | Added initial version of a GEMM kernel selection tuner | Cedric Nugteren |
| 2017-10-03 | Gemm in-direct implementation now uses only 1 larger instead of max 3 optional temporary buffers | Cedric Nugteren |
| 2017-09-30 | Refactored the tuning architecture: less duplicate now; more defaults | Cedric Nugteren |
| 2017-09-10 | Added the new vendor-architecture-name hierarchy to the tuners as well | Cedric Nugteren |
| 2017-08-31 | Fixed some things in the tuner: bugs, style, and defaults to random search | Cedric Nugteren |
| 2017-08-21 | Minor updates after merging in the PSO addition to the tuners | Cedric Nugteren |
| 2017-08-21 | Remove multistrategy and related functions | mcian |
| 2017-08-09 | Revert the xgemm strategy to default. If user wants to use multistrategy can simple call the function TestHeuristic from the main | mcian |
| 2017-08-09 | Use cltune::SearchMethod enum instead of int values | mcian |
| 2017-07-23 | Code refactoring | mcian |
| 2017-07-17 | Add PSO parameters support and search strategy selection from command line | mcian |
| 2017-05-11 | Re-added random tuning for GEMM after accidental removal | Cedric Nugteren |
| 2017-04-22 | Increased the default number of runs for the tuner from 2 up to 10 for fast kernels | Cedric Nugteren |
| 2017-04-21 | Increased the default number of runs for GEMV tuning; updated GEMV tuning results for Iris Pro | Cedric Nugteren |
| 2017-04-17 | Fixed a namespace clash with CUDA FP16 for the half-datatype | Cedric Nugteren |
| 2017-04-14 | Added a new Xaxpy kernel in between the regular and fast version in | Cedric Nugteren |
| 2017-03-14 | Added the possibility to tune batched kernels | Cedric Nugteren |
| 2017-03-05 | Changed the way the test-data is generated: now using a single MT generator and distribution for all data | Cedric Nugteren |
| 2016-11-27 | Made it possible to use the command-line environmental variables for each executable and without re-running CMake | Cedric Nugteren |
| 2016-10-22 | Moved files around a bit; created a utilities subfolder | Cedric Nugteren |
| 2016-10-03 | Re-organised GEMM direct kernel and added faster fall-back version for incomplete rectangles | Cedric Nugteren |
| 2016-10-02 | Set the default number of runs for all kernels to at least 2 runs | Cedric Nugteren |
| 2016-10-02 | Specialised the GEMM direct kernel in four ways for transposing/non-transposing: NN, NT, TN, TT | Cedric Nugteren |
| 2016-10-02 | Split the GEMM direct kernel into two files; set the default tuning target to 256-256-256 | Cedric Nugteren |
| 2016-10-01 | Added padding to the local memory of the GEMM direct kernel | Cedric Nugteren |
| 2016-10-01 | Added default num-runs to the tuner adding averaging over 10 runs as a default for the GEMM direct kernel | Cedric Nugteren |
| 2016-10-01 | Merge branch 'development' into gemm_direct | Cedric Nugteren |
| 2016-09-27 | Added an option to run tuned kernels multiple times to average execution times; requires CLTune 2.5.0 | Cedric Nugteren |
| 2016-09-27 | Fixed the local memory size computation for the GEMM tuners | Cedric Nugteren |
| 2016-09-27 | Now generates test/client/tuner data using a fixed seed to enable reproducability of results | Cedric Nugteren |
| 2016-09-25 | Added a first version of a tuner for the GEMM direct kernel; collapsed MWGD, NWGD and KWGD into one WGD parameter | Cedric Nugteren |
| 2016-09-12 | Split the XGEMM kernel further up: now in 3 parts. This is done because MSVC can't handle long strings | Cedric Nugteren |
| 2016-09-06 | Split GEMM tuning in two parts: a small set of tuning parameters which is explored exhaustively and a larger set which is explored randomly | Cedric Nugteren |
| 2016-08-21 | Increased the ratio of GEMM tuning results to explore; reduced the tuning search space to have a better chance to evaluate more likely parameter combinations | Cedric Nugteren |
| 2016-07-25 | Moved the XgemvFast and XgemvFastRot tuning database into a separate file | Cedric Nugteren |
| 2016-07-23 | Fixe a bug in the new XgemvFastRot kernel related to local memory size | Cedric Nugteren |
| 2016-07-23 | Further improvements to the XgemvFastRot kernel, properly enables coalescing now | Cedric Nugteren |
| 2016-07-23 | Improved the XgemvFastRot kernel by tiled loading of the input matrix A, enabling better memory performance | Cedric Nugteren |
| 2016-07-10 | Now passing alpha/beta to the kernel as arguments as before fp16 support; in case of fp16 arguments are cast on host and in kernel | Cedric Nugteren |