Age | Commit message | Author |
2015-08-13 | Added all supported routines to the C API | CNugteren |
2015-08-13 | Added SGEMM example using the C API | CNugteren |
2015-08-13 | Added initial version of C API with just one routine | CNugteren |
2015-08-04 | Merge pull request #19 from CNugteren/basic_level2_routines | Cedric Nugteren |
2015-08-04 | Added distinguished names for GEMV inherited HEMV/SYMV | CNugteren |
2015-08-03 | Abstracted loading of matrix A for GEMV kernel | CNugteren |
2015-07-31 | Added HEMV and SYMV | CNugteren |
2015-07-31 | Added HEMV and SYMV | CNugteren |
2015-07-31 | Added HEMV routine | CNugteren |
2015-07-31 | Added SYMV routine | CNugteren |
2015-07-31 | Merge pull request #18 from CNugteren/correctness_test_refactoring | Cedric Nugteren |
2015-07-31 | Refactored the correctness tests | CNugteren |
2015-07-31 | Merge pull request #17 from CNugteren/clblas_external | Cedric Nugteren |
2015-07-31 | Updated documentation reflecting removal of clBLAS sources | CNugteren |
2015-07-31 | Removed clBLAS source code, now requires separate installation | CNugteren |
2015-07-27 | Moved the preferred options of clBLAS (no tests) to the CLBlast CMakeLists file | CNugteren |
2015-07-27 | Merge pull request #16 from CNugteren/claduc_header | Cedric Nugteren |
2015-07-27 | Now using the new Claduc C++11 OpenCL header | CNugteren |
2015-07-24 | Prepared the changelog for the next release | CNugteren |
2015-07-24 | Updated to version 0.3.0 | CNugteren |
2015-07-24 | Merge pull request #14 from CNugteren/amd_performance | Cedric Nugteren |
2015-07-24 | Updated the docs to reflect the performance improvements | CNugteren |
2015-07-23 | Updated the performance results, added HD7950 | CNugteren |
2015-07-22 | Made the graph script robust against diagnostic system messages | CNugteren |
2015-07-22 | Set the correct name for AMD OpenCL devices | CNugteren |
2015-07-22 | Updated GEMM tuning results for Tahiti | CNugteren |
2015-07-22 | Added workgroup shuffle option to transpose kernel for AMD GPUs | CNugteren |
2015-07-21 | Transpose kernel now uses vectorized local memory loads and stores | CNugteren |
2015-07-19 | Triangular GEMM kernels are only compiled when needed | CNugteren |
2015-07-19 | Kernel caching is now based on a routine's name | CNugteren |
2015-07-19 | The kernel source string is now a routine's member variable | CNugteren |
2015-07-19 | Fixed complex performance on Intel Iris | CNugteren |
2015-07-16 | Fixed a bug when using the Xgemm kernel without local memory | CNugteren |
2015-07-16 | Using mad() instruction for AMD devices like clBLAS does | CNugteren |
2015-07-15 | Merge pull request #13 from CNugteren/bypass_pre_post_processing | Cedric Nugteren |
2015-07-15 | Updated changelog with pre/post-processing bypass | CNugteren |
2015-07-15 | Changed performance graphs to default to column-major | CNugteren |
2015-07-15 | Skips pre/post processing kernels if not needed | CNugteren |
2015-07-13 | Updated interface of the PadCopyTransposeMatrix method | CNugteren |
2015-07-12 | Merge pull request #12 from CNugteren/level_subfolders | Cedric Nugteren |
2015-07-12 | Added subfolders for the level1/2/3 routines | CNugteren |
2015-07-12 | Merge pull request #11 from CNugteren/level3_routines_2 | Cedric Nugteren |
2015-07-12 | Added HEMM, HERK, HER2K, and TRMM | CNugteren |
2015-07-12 | Added the HEMM routine, tester, and client | CNugteren |
2015-07-10 | Disabled prototype of TRSM | CNugteren |
2015-07-10 | Added the HER2K routine, tester, and client | CNugteren |
2015-07-10 | Added the HERK routine, tester, and client | CNugteren |
2015-07-10 | The clients now distinguish between the memory and alpha/beta data-type | CNugteren |
2015-07-08 | Added option to set the imaginary part of the diagonal to zero | CNugteren |
2015-07-08 | The testers now distinguish between the memory and alpha/beta data-type | CNugteren |