Age | Commit message | Author |
2016-07-25 | Moved the XgemvFast and XgemvFastRot tuning database into a separate file | Cedric Nugteren |
2016-07-24 | Merge branch 'development' into gemv_performance | Cedric Nugteren |
2016-07-23 | Improved the XgemvFastRot kernel by tiled loading of the input matrix A, enab... | Cedric Nugteren |
2016-07-22 | clblast::RunKernel, cl::Kernel: unify variants with/without waitForEvents, su... | Ivan Shapovalov |
2016-07-22 | clblast::RunKernel, cl::Kernel: take const vector as waitForEvents | Ivan Shapovalov |
2016-07-22 | xgemm: do not hardcode kernel requirements for internal matrix layout | Ivan Shapovalov |
2016-07-16 | Removed an unused variable from the copy-transpose-pad function | Cedric Nugteren |
2016-07-10 | Now passing alpha/beta to the kernel as arguments as before fp16 support; in ... | Cedric Nugteren |
2016-07-06 | Added a VERBOSE mode to debug performance: now prints details about compilati... | Cedric Nugteren |
2016-06-27 | Fixes for the AppVeyor Windows build | Cedric Nugteren |
2016-06-19 | Renamed all C++ source files to .cpp to match the .hpp extension better | Cedric Nugteren |
2016-06-18 | Moved all headers into the source tree, changed headers to .hpp extension | Cedric Nugteren |
2016-06-18 | Clean-up of the routine class, moved RunKernel to the routine/common file | Cedric Nugteren |
2016-06-18 | Removed the template from the Routine base-class | Cedric Nugteren |
2016-06-17 | Removed the precision argument from the routines in favor of a single templat... | Cedric Nugteren |
2016-06-17 | Removed the interface to the cache functions from the Routine class, calls th... | Cedric Nugteren |
2016-06-17 | Moved the RunKernel and PadCopyTransposeMatrix functions out of the Routine c... | Cedric Nugteren |
2016-06-17 | Moved the test-for-valid-buffers function from the Routine class to separate ... | Cedric Nugteren |
2016-06-16 | Added XOMATCOPY routines to perform out-of-place matrix scaling, copying, and... | Cedric Nugteren |
2016-06-15 | Added some constness to variables related to the GEMM routines | Cedric Nugteren |
2016-06-14 | Re-organised the level-3 supporting kernels (copy, pad, transpose, convert) a... | Cedric Nugteren |
2016-05-25 | Added level-3 half-precision routines HGEMM/HSYMM/HSYRK/HSYR2K/HTRMM | Cedric Nugteren |
2016-05-22 | Added level-2 half-precision routines HGER/HSYR/HSPR/HSYR2/HSPR2 | Cedric Nugteren |
2016-05-22 | Prepared the GER kernels and tuner for half-precision support | Cedric Nugteren |
2016-05-22 | Added level-2 half-precision routines HGEMV/HGBMV/HHEMV/HHBMV/HHPMV/HSYMV/HSB... | Cedric Nugteren |
2016-05-22 | Prepared the GEMV kernels and tuner for half-precision support | Cedric Nugteren |
2016-05-22 | Added half-precision support for all level 1 routines | Cedric Nugteren |
2016-05-18 | Merged in latest changes from 0.7.1 release | Cedric Nugteren |
2016-05-16 | Prepared GEMM and supporting kernels and tuners for half-precision support | Cedric Nugteren |
2016-05-14 | Set kernel arguments for AXPY as constant memory buffers, making it possible ... | Cedric Nugteren |
2016-05-13 | Initial experimental version of the half-precision HAXPY routine | Cedric Nugteren |
2016-05-01 | Changed the index buffer of IxAMAX routines to unsigned int for proper buffer... | Cedric Nugteren |
2016-04-28 | Fixed the cache to store binaries instead of OpenCL programs | Cedric Nugteren |
2016-04-20 | Added support for the iSAMAX/iDAMAX/iCAMAX/iZAMAX routines | cnugteren |
2016-04-14 | Added support for the SASUM/DASUM/ScASUM/DzASUM routines | cnugteren |
2016-04-09 | Events are now properly implemented using event waiting list and asking the u... | cnugteren |
2016-04-04 | Removed redundant queue synchronisation statements | cnugteren |
2016-03-28 | Added preliminary support for the xNRM2 routines | Cedric Nugteren |
2016-03-06 | Fixed a bug in the GER-family of routines due to incorrect division of the wo... | Cedric Nugteren |
2016-03-06 | Added preliminary support for xHPR2 and xSPR2 routines | Cedric Nugteren |
2016-03-02 | Added preliminary support for xHER2 and xSYR2 routines | Cedric Nugteren |
2016-02-28 | Fixed a couple of correctness bugs in the Xher kernels | Cedric Nugteren |
2016-02-28 | Added support for xHER, xHPR, xSYR, and xSPR routines | Cedric Nugteren |
2016-02-20 | Added support for xGERU and xGERC routines | Cedric Nugteren |
2016-02-20 | Added XGER routine, kernel, and tuner | Cedric Nugteren |
2016-02-08 | Separated the GEMM kernel in two parts to reduce string length for MSVC | Cedric Nugteren |
2016-02-08 | Split-up the XGEMV kernel in two parts | Cedric Nugteren |
2016-01-30 | Added first auto-generated database headers from the Python database; only K4... | Cedric Nugteren |
2015-10-12 | Routine names are now all default arguments defined in the header | CNugteren |
2015-10-12 | Moved level3 kernel files to a subfolder | CNugteren |