path: root/src/tuning/kernels/xgemm.cpp
Age | Commit message | Author
2017-04-17 | Fixed a namespace clash with CUDA FP16 for the half-datatype | Cedric Nugteren
2017-03-14 | Added the possibility to tune batched kernels | Cedric Nugteren
2016-11-27 | Made it possible to use the command-line environmental variables for each executable and without re-running CMake | Cedric Nugteren
2016-10-22 | Moved files around a bit; created a utilities subfolder | Cedric Nugteren
2016-10-02 | Set the default number of runs for all kernels to at least 2 runs | Cedric Nugteren
2016-10-01 | Added default num-runs to the tuner adding averaging over 10 runs as a default for the GEMM direct kernel | Cedric Nugteren
2016-09-27 | Fixed the local memory size computation for the GEMM tuners | Cedric Nugteren
2016-09-12 | Split the XGEMM kernel further up: now in 3 parts. This is done because MSVC can't handle long strings | Cedric Nugteren
2016-09-06 | Split GEMM tuning in two parts: a small set of tuning parameters which is explored exhaustively and a larger set which is explored randomly | Cedric Nugteren
2016-08-21 | Increased the ratio of GEMM tuning results to explore; reduced the tuning search space to have a better chance to evaluate more likely parameter combinations | Cedric Nugteren
2016-07-10 | Now passing alpha/beta to the kernel as arguments as before fp16 support; in case of fp16 arguments are cast on host and in kernel | Cedric Nugteren
2016-06-19 | Renamed all C++ source files to .cpp to match the .hpp extension better | Cedric Nugteren