Age | Commit message | Author |
2016-07-23 | Improved the XgemvFastRot kernel by tiled loading of the input matrix A, enab... | Cedric Nugteren |
2016-07-10 | Now passing alpha/beta to the kernel as arguments as before fp16 support; in ... | Cedric Nugteren |
2016-06-19 | Renamed all C++ source files to .cpp to match the .hpp extension better | Cedric Nugteren |
2016-06-18 | Moved all headers into the source tree, changed headers to .hpp extension | Cedric Nugteren |
2016-06-16 | Added XOMATCOPY routines to perform out-of-place matrix scaling, copying, and... | Cedric Nugteren |
2016-06-14 | Re-organised the level-3 supporting kernels (copy, pad, transpose, convert) a... | Cedric Nugteren |
2016-05-22 | Prepared the GER kernels and tuner for half-precision support | Cedric Nugteren |
2016-05-22 | Prepared the GEMV kernels and tuner for half-precision support | Cedric Nugteren |
2016-05-22 | Added half-precision support for all level 1 routines | Cedric Nugteren |
2016-05-16 | Prepared GEMM and supporting kernels and tuners for half-precision support | Cedric Nugteren |
2016-05-15 | Added header with conversions from and to half-precision floating-point | Cedric Nugteren |
2016-05-13 | Initial experimental version of the half-precision HAXPY routine | Cedric Nugteren |
2016-05-01 | Made the default xDOT tuning size smaller | Cedric Nugteren |
2016-04-14 | Updated the reduction-kernel tuner to also tune the epilogue | cnugteren |
2016-02-28 | Added support for xHER, xHPR, xSYR, and xSPR routines | Cedric Nugteren |
2016-02-20 | Added XGER routine, kernel, and tuner | Cedric Nugteren |
2016-02-08 | Separated the GEMM kernel in two parts to reduce string length for MSVC | Cedric Nugteren |
2016-02-08 | Split-up the XGEMV kernel in two parts | Cedric Nugteren |
2016-02-06 | Reduced the maximum workgroup-size for GEMV kernels further | CNugteren |
2016-02-06 | Reduced unrolling factor in xgemv kernel to reduce compilation times | CNugteren |
2015-10-28 | Now sets local memory size in xgemv tuner properly | CNugteren |
2015-10-25 | Fixed an arguments-related bug in the GEMV tuner | CNugteren |
2015-10-12 | Moved level3 kernel files to a subfolder | CNugteren |
2015-09-18 | Added first version of banded matrix-vector multiplication | CNugteren |
2015-09-14 | Added xDOT/xDOTU/xDOTC dot-product routines | CNugteren |
2015-09-14 | Added extra temporary buffer to tuners in preparation of Xdot routines | CNugteren |
2015-08-22 | Re-organized level1 xaxpy kernel | CNugteren |
2015-08-09 | Refactored the tuners, added JSON output | CNugteren |
2015-07-22 | Added workgroup shuffle option to transpose kernel for AMD GPUs | CNugteren |
2015-07-19 | The kernel source string is now a routine's member variable | CNugteren |
2015-06-16 | Added support for conjugate transpose in GEMV | CNugteren |
2015-06-16 | Updated the tuners to set the conjugate argument | CNugteren |
2015-06-14 | Split the three variations of the GEMV kernel for maximal tuning freedom | CNugteren |
2015-06-13 | Added a fast GEMV kernel with vector loads, no tail, and fewer if-statements | CNugteren |
2015-06-13 | Improved GEMV kernel with local memory and a tunable WPT | CNugteren |
2015-06-13 | Added initial version of GEMV including tester and performance client | CNugteren |
2015-06-10 | Added initial naive version of Xgemv kernel | CNugteren |
2015-05-30 | Initial commit of preview version | CNugteren |