| Age | Commit message | Author |
|---|---|---|
| 2015-07-22 | Made the graph script robust against diagnostic system messages | CNugteren |
| 2015-07-22 | Set the correct name for AMD OpenCL devices | CNugteren |
| 2015-07-22 | Updated GEMM tuning results for Tahiti | CNugteren |
| 2015-07-22 | Added workgroup shuffle option to transpose kernel for AMD GPUs | CNugteren |
| 2015-07-21 | Transpose kernel now uses vectorized local memory loads and stores | CNugteren |
| 2015-07-19 | Triangular GEMM kernels are only compiled when needed | CNugteren |
| 2015-07-19 | Kernel caching is now based on a routine's name | CNugteren |
| 2015-07-19 | The kernel source string is now a routine's member variable | CNugteren |
| 2015-07-19 | Fixed complex performance on Intel Iris | CNugteren |
| 2015-07-16 | Fixed a bug when using the Xgemm kernel without local memory | CNugteren |
| 2015-07-16 | Using mad() instruction for AMD devices like clBLAS does | CNugteren |
| 2015-07-15 | Merge pull request #13 from CNugteren/bypass_pre_post_processing | Cedric Nugteren |
| 2015-07-15 | Updated changelog with pre/post-processing bypass | CNugteren |
| 2015-07-15 | Changed performance graphs to default to column-major | CNugteren |
| 2015-07-15 | Skips pre/post processing kernels if not needed | CNugteren |
| 2015-07-13 | Updated interface of the PadCopyTransposeMatrix method | CNugteren |
| 2015-07-12 | Merge pull request #12 from CNugteren/level_subfolders | Cedric Nugteren |
| 2015-07-12 | Added subfolders for the level1/2/3 routines | CNugteren |
| 2015-07-12 | Merge pull request #11 from CNugteren/level3_routines_2 | Cedric Nugteren |
| 2015-07-12 | Added HEMM, HERK, HER2K, and TRMM | CNugteren |
| 2015-07-12 | Added the HEMM routine, tester, and client | CNugteren |
| 2015-07-10 | Disabled prototype of TRSM | CNugteren |
| 2015-07-10 | Added the HER2K routine, tester, and client | CNugteren |
| 2015-07-10 | Added the HERK routine, tester, and client | CNugteren |
| 2015-07-10 | The clients now distinguish between the memory and alpha/beta data-type | CNugteren |
| 2015-07-08 | Added option to set the imaginary part of the diagonal to zero | CNugteren |
| 2015-07-08 | The testers now distinguish between the memory and alpha/beta data-type | CNugteren |
| 2015-07-07 | Added option to set the imaginary part of the diagonal to zero | CNugteren |
| 2015-07-02 | Added the TRMM routine, tester, and client | CNugteren |
| 2015-07-02 | Fixed the order of arguments | CNugteren |
| 2015-07-02 | Added a set-to-one function for kernels | CNugteren |
| 2015-07-01 | Added the unit/non-unit diagonal enum | CNugteren |
| 2015-07-01 | Fixed typos in SYMM | CNugteren |
| 2015-06-30 | Added the TRMM and TRSM interface | CNugteren |
| 2015-06-30 | Added constness to all cl_mem objects | CNugteren |
| 2015-06-30 | Added TRMM and TRSM clBLAS wrappers | CNugteren |
| 2015-06-29 | Merge pull request #10 from CNugteren/test_infrastructure | Cedric Nugteren |
| 2015-06-29 | Re-organized test and client infrastructure | CNugteren |
| 2015-06-29 | Fixed the license for the correctness testers | CNugteren |
| 2015-06-29 | Re-organized the performance-client infrastructure to avoid code duplication | CNugteren |
| 2015-06-28 | Re-organized the test infrastructure to avoid code duplication | CNugteren |
| 2015-06-28 | Added buffer structure and sizes to arguments | CNugteren |
| 2015-06-26 | Merge pull request #9 from CNugteren/level3_routines | Cedric Nugteren |
| 2015-06-26 | Replaced crosses with tickmarks | CNugteren |
| 2015-06-26 | Added the SYR2K routine, tester, and client | CNugteren |
| 2015-06-26 | Added symmetric matrix support for the ABC performance tester | CNugteren |
| 2015-06-25 | Added option to test only symmetric matrices (m=n) | CNugteren |
| 2015-06-25 | Clarified comment | CNugteren |
| 2015-06-25 | Added SSYRK performance graphs | CNugteren |
| 2015-06-24 | Added the SYRK routine | CNugteren |