Age | Commit message | Author |
2016-02-28 | Fixed a couple of correctness bugs in the Xher kernels | Cedric Nugteren |
2016-02-28 | Added support for xHER, xHPR, xSYR, and xSPR routines | Cedric Nugteren |
2016-02-20 | Added support for xGERU and xGERC routines | Cedric Nugteren |
2016-02-20 | Added XGER routine, kernel, and tuner | Cedric Nugteren |
2016-02-08 | Separated the GEMM kernel in two parts to reduce string length for MSVC | Cedric Nugteren |
2016-02-08 | Split-up the XGEMV kernel in two parts | Cedric Nugteren |
2016-02-06 | Reduced unrolling factor in xgemv kernel to reduce compilation times | CNugteren |
2015-10-13 | Added guards for routine-specific level-3 pad kernels | CNugteren |
2015-10-12 | Moved level3 kernel files to a subfolder | CNugteren |
2015-09-26 | Added TRMV/TBMV/TPMV routines | CNugteren |
2015-09-19 | Added SBMV and SPMV routines | CNugteren |
2015-09-19 | Added the HPMV routine | CNugteren |
2015-09-19 | Added the HBMV routine | CNugteren |
2015-09-18 | Improved the organization and performance of level 2 routines | CNugteren |
2015-09-18 | Added first version of banded matrix-vector multiplication | CNugteren |
2015-09-14 | Added xDOT/xDOTU/xDOTC dot-product routines | CNugteren |
2015-08-22 | Added the XSWAP, XSCAL and XCOPY level-1 routines | CNugteren |
2015-08-22 | Re-organized level1 xaxpy kernel | CNugteren |
2015-08-13 | Fixed a complex data-type bug in the transpose kernel | CNugteren |
2015-08-04 | Added distinguished names for GEMV inherited HEMV/SYMV | CNugteren |
2015-08-03 | Abstracted loading of matrix A for GEMV kernel | CNugteren |
2015-07-22 | Added workgroup shuffle option to transpose kernel for AMD GPUs | CNugteren |
2015-07-21 | Transpose kernel now uses vectorized local memory loads and stores | CNugteren |
2015-07-19 | Triangular GEMM kernels are only compiled when needed | CNugteren |
2015-07-19 | The kernel source string is now a routine's member variable | CNugteren |
2015-07-16 | Fixed a bug when using the Xgemm kernel without local memory | CNugteren |
2015-07-16 | Using mad() instruction for AMD devices like clBLAS does | CNugteren |
2015-07-12 | Added the HEMM routine, tester, and client | CNugteren |
2015-07-07 | Added option to set the imaginary part of the diagonal to zero | CNugteren |
2015-07-02 | Added the TRMM routine, tester, and client | CNugteren |
2015-07-02 | Added a set-to-one function for kernels | CNugteren |
2015-06-23 | Added a lower/upper triangular version of the GEMM kernel | CNugteren |
2015-06-23 | Added a condition to update only lower/upper triangular parts in the un-pad k... | CNugteren |
2015-06-16 | Added support for conjugate transpose in GEMV | CNugteren |
2015-06-16 | Added support for complex conjugate transpose | CNugteren |
2015-06-15 | Fixed a bug in AXPBY defines for complex data-types | CNugteren |
2015-06-14 | Split the three variations of the GEMV kernel for maximal tuning freedom | CNugteren |
2015-06-13 | Added a fast GEMV kernel with vector loads, no tail, and fewer if-statements | CNugteren |
2015-06-13 | Refactored the GEMV kernel | CNugteren |
2015-06-13 | Improved GEMV kernel with local memory and a tunable WPT | CNugteren |
2015-06-13 | Added initial version of GEMV including tester and performance client | CNugteren |
2015-06-10 | Added initial naive version of Xgemv kernel | CNugteren |
2015-05-30 | Initial commit of preview version | CNugteren |