path: root/src/tuning
| Age | Commit message | Author |
|---|---|---|
| 2016-06-19 | Renamed all C++ source files to .cpp to match the .hpp extension better | Cedric Nugteren |
| 2016-06-18 | Moved all headers into the source tree, changed headers to .hpp extension | Cedric Nugteren |
| 2016-06-16 | Added XOMATCOPY routines to perform out-of-place matrix scaling, copying, and... | Cedric Nugteren |
| 2016-06-14 | Re-organised the level-3 supporting kernels (copy, pad, transpose, convert) a... | Cedric Nugteren |
| 2016-05-22 | Prepared the GER kernels and tuner for half-precision support | Cedric Nugteren |
| 2016-05-22 | Prepared the GEMV kernels and tuner for half-precision support | Cedric Nugteren |
| 2016-05-22 | Added half-precision support for all level 1 routines | Cedric Nugteren |
| 2016-05-16 | Prepared GEMM and supporting kernels and tuners for half-precision support | Cedric Nugteren |
| 2016-05-15 | Added header with conversions from and to half-precision floating-point | Cedric Nugteren |
| 2016-05-13 | Initial experimental version of the half-precision HAXPY routine | Cedric Nugteren |
| 2016-05-01 | Made the default xDOT tuning size smaller | Cedric Nugteren |
| 2016-04-14 | Updated the reduction-kernel tuner to also tune the epilogue | cnugteren |
| 2016-02-28 | Added support for xHER, xHPR, xSYR, and xSPR routines | Cedric Nugteren |
| 2016-02-20 | Added XGER routine, kernel, and tuner | Cedric Nugteren |
| 2016-02-08 | Separated the GEMM kernel in two parts to reduce string length for MSVC | Cedric Nugteren |
| 2016-02-08 | Split-up the XGEMV kernel in two parts | Cedric Nugteren |
| 2016-02-06 | Reduced the maximum workgroup-size for GEMV kernels further | CNugteren |
| 2016-02-06 | Reduced unrolling factor in xgemv kernel to reduce compilation times | CNugteren |
| 2015-10-28 | Now sets local memory size in xgemv tuner properly | CNugteren |
| 2015-10-25 | Fixed an arguments-related bug in the GEMV tuner | CNugteren |
| 2015-10-12 | Moved level3 kernel files to a subfolder | CNugteren |
| 2015-09-18 | Added first version of banded matrix-vector multiplication | CNugteren |
| 2015-09-14 | Added xDOT/xDOTU/xDOTC dot-product routines | CNugteren |
| 2015-09-14 | Added extra temporary buffer to tuners in preparation of Xdot routines | CNugteren |
| 2015-08-22 | Re-organized level1 xaxpy kernel | CNugteren |
| 2015-08-09 | Refactored the tuners, added JSON output | CNugteren |
| 2015-07-22 | Added workgroup shuffle option to transpose kernel for AMD GPUs | CNugteren |
| 2015-07-19 | The kernel source string is now a routine's member variable | CNugteren |
| 2015-06-16 | Added support for conjugate transpose in GEMV | CNugteren |
| 2015-06-16 | Updated the tuners to set the conjugate argument | CNugteren |
| 2015-06-14 | Split the three variations of the GEMV kernel for maximal tuning freedom | CNugteren |
| 2015-06-13 | Added a fast GEMV kernel with vector loads, no tail, and fewer if-statements | CNugteren |
| 2015-06-13 | Improved GEMV kernel with local memory and a tunable WPT | CNugteren |
| 2015-06-13 | Added initial version of GEMV including tester and performance client | CNugteren |
| 2015-06-10 | Added initial naive version of Xgemv kernel | CNugteren |
| 2015-05-30 | Initial commit of preview version | CNugteren |