path: root/src/tuning/kernels/xgemv.cpp
Age | Commit message | Author
2017-03-14 | Added the possibility to tune batched kernels | Cedric Nugteren
2016-11-27 | Made it possible to use the command-line environmental variables for each executable and without re-running CMake | Cedric Nugteren
2016-10-22 | Moved files around a bit; created a utilities subfolder | Cedric Nugteren
2016-10-02 | Set the default number of runs for all kernels to at least 2 runs | Cedric Nugteren
2016-10-01 | Added default num-runs to the tuner adding averaging over 10 runs as a default for the GEMM direct kernel | Cedric Nugteren
2016-07-25 | Moved the XgemvFast and XgemvFastRot tuning database into a separate file | Cedric Nugteren
2016-07-23 | Fixe a bug in the new XgemvFastRot kernel related to local memory size | Cedric Nugteren
2016-07-23 | Further improvements to the XgemvFastRot kernel, properly enables coalescing now | Cedric Nugteren
2016-07-23 | Improved the XgemvFastRot kernel by tiled loading of the input matrix A, enabling better memory performance | Cedric Nugteren
2016-07-10 | Now passing alpha/beta to the kernel as arguments as before fp16 support; in case of fp16 arguments are cast on host and in kernel | Cedric Nugteren
2016-06-19 | Renamed all C++ source files to .cpp to match the .hpp extension better | Cedric Nugteren