path: root/src/routines
Age | Commit message | Author
2016-06-27 | Fixes for the AppVeyor Windows build | Cedric Nugteren
2016-06-19 | Renamed all C++ source files to .cpp to match the .hpp extension better | Cedric Nugteren
2016-06-18 | Moved all headers into the source tree, changed headers to .hpp extension | Cedric Nugteren
2016-06-18 | Clean-up of the routine class, moved RunKernel to the routine/common file | Cedric Nugteren
2016-06-18 | Removed the template from the Routine base-class | Cedric Nugteren
2016-06-17 | Removed the precision argument from the routines in favor of a single templated function | Cedric Nugteren
2016-06-17 | Removed the interface to the cache functions from the Routine class, calls them directly now | Cedric Nugteren
2016-06-17 | Moved the RunKernel and PadCopyTransposeMatrix functions out of the Routine class | Cedric Nugteren
2016-06-17 | Moved the test-for-valid-buffers function from the Routine class to separate functions in a separate file | Cedric Nugteren
2016-06-16 | Added XOMATCOPY routines to perform out-of-place matrix scaling, copying, and/or transposing | Cedric Nugteren
2016-06-15 | Added some constness to variables related to the GEMM routines | Cedric Nugteren
2016-06-14 | Re-organised the level-3 supporting kernels (copy, pad, transpose, convert) and renamed files and functions appropriately | Cedric Nugteren
2016-05-25 | Added level-3 half-precision routines HGEMM/HSYMM/HSYRK/HSYR2K/HTRMM | Cedric Nugteren
2016-05-22 | Added level-2 half-precision routines HGER/HSYR/HSPR/HSYR2/HSPR2 | Cedric Nugteren
2016-05-22 | Prepared the GER kernels and tuner for half-precision support | Cedric Nugteren
2016-05-22 | Added level-2 half-precision routines HGEMV/HGBMV/HHEMV/HHBMV/HHPMV/HSYMV/HSBMV/HSPMV/HTRMV/HTBMV/HTPMV | Cedric Nugteren
2016-05-22 | Prepared the GEMV kernels and tuner for half-precision support | Cedric Nugteren
2016-05-22 | Added half-precision support for all level 1 routines | Cedric Nugteren
2016-05-18 | Merged in latest changes from 0.7.1 release | Cedric Nugteren
2016-05-16 | Prepared GEMM and supporting kernels and tuners for half-precision support | Cedric Nugteren
2016-05-14 | Set kernel arguments for AXPY as constant memory buffers, making it possible to transfer half-precision values as well | Cedric Nugteren
2016-05-13 | Initial experimental version of the half-precision HAXPY routine | Cedric Nugteren
2016-05-01 | Changed the index buffer of IxAMAX routines to unsigned int for proper buffersize checking | Cedric Nugteren
2016-04-28 | Fixed the cache to store binaries instead of OpenCL programs | Cedric Nugteren
2016-04-20 | Added support for the iSAMAX/iDAMAX/iCAMAX/iZAMAX routines | cnugteren
2016-04-14 | Added support for the SASUM/DASUM/ScASUM/DzASUM routines | cnugteren
2016-04-09 | Events are now properly implemented using event waiting list and asking the user to wait for event completion | cnugteren
2016-04-04 | Removed redundant queue synchronisation statements | cnugteren
2016-03-28 | Added preliminary support for the xNRM2 routines | Cedric Nugteren
2016-03-06 | Fixed a bug in the GER-family of routines due to incorrect division of the workgroup size | Cedric Nugteren
2016-03-06 | Added preliminary support for xHPR2 and xSPR2 routines | Cedric Nugteren
2016-03-02 | Added preliminary support for xHER2 and xSYR2 routines | Cedric Nugteren
2016-02-28 | Fixed a couple of correctness bugs in the Xher kernels | Cedric Nugteren
2016-02-28 | Added support for xHER, xHPR, xSYR, and xSPR routines | Cedric Nugteren
2016-02-20 | Added support for xGERU and xGERC routines | Cedric Nugteren
2016-02-20 | Added XGER routine, kernel, and tuner | Cedric Nugteren
2016-02-08 | Separated the GEMM kernel in two parts to reduce string length for MSVC | Cedric Nugteren
2016-02-08 | Split-up the XGEMV kernel in two parts | Cedric Nugteren
2016-01-30 | Added first auto-generated database headers from the Python database; only K40 and Iris supported now | Cedric Nugteren
2015-10-12 | Routine names are now all default arguments defined in the header | CNugteren
2015-10-12 | Moved level3 kernel files to a subfolder | CNugteren
2015-09-26 | Added TRMV/TBMV/TPMV routines | CNugteren
2015-09-19 | Added SBMV and SPMV routines | CNugteren
2015-09-19 | Added the HPMV routine | CNugteren
2015-09-19 | Added the HBMV routine | CNugteren
2015-09-18 | Improved the organization and performance of level 2 routines | CNugteren
2015-09-18 | Added first version of banded matrix-vector multiplication | CNugteren
2015-09-14 | Added xDOT/xDOTU/xDOTC dot-product routines | CNugteren
2015-08-22 | Added the XSWAP, XSCAL and XCOPY level-1 routines | CNugteren
2015-08-22 | Re-organized level1 xaxpy kernel | CNugteren