Age | Commit message | Author |
2016-10-25 | Sets the proper sizes for the buffers for the Netlib CBLAS API | Cedric Nugteren |
2016-10-25 | Added initial version of a Netlib CBLAS implementation. TODO: Set correct buf... | Cedric Nugteren |
2016-10-25 | Merge branch 'development' into netlib_blas_api | Cedric Nugteren |
2016-10-22 | All enums in the C API are now prefixed with CLBlast to avoid potential name ... | Cedric Nugteren |
2016-10-22 | Added extra error codes to reflect the more detailed error reporting of OpenC... | Cedric Nugteren |
2016-10-22 | Routine: get rid of ::SetUp() | Ivan Shapovalov |
2016-10-22 | treewide: use C++ exceptions properly | Ivan Shapovalov |
2016-10-16 | Merge branch 'development' into netlib_blas_api | Cedric Nugteren |
2016-10-14 | Fixed an issue with a growing database: the database is now a global variable... | Cedric Nugteren |
2016-10-10 | Changed the storage location of the database to a separate GitHub repository | Cedric Nugteren |
2016-10-10 | Added fresh performance graphs for GeForce 750Ti; removed old GTX480 results | Cedric Nugteren |
2016-10-08 | Added benchmark script for small matrix sizes, testing the direct GEMM kernels | Cedric Nugteren |
2016-10-05 | Made non-standard types void-pointers in the Netlib BLAS interface | Cedric Nugteren |
2016-10-05 | Added first version of Netlib BLAS API header | Cedric Nugteren |
2016-09-12 | Added XgemvFastRot and Xgemm 16-bit tuning results: just defaults which are n... | Cedric Nugteren |
2016-09-11 | Complete re-write of the database script. Changed Pandas for the much faster ... | Cedric Nugteren |
2016-09-10 | Updated database based on exhaustive tuning results for GEMM for the R9 M370X... | Cedric Nugteren |
2016-09-10 | Updated the database script to remove duplicate entries: keeps only the best-... | Cedric Nugteren |
2016-09-04 | Refactored the Python C++ generator script; now conforms to the PEP8 style guide | Cedric Nugteren |
2016-09-03 | Added tuning results for Intel Broadwell 5500 GT2 GPU | Cedric Nugteren |
2016-09-03 | Updated tuning results for Haswell GT2 Mobile GPU; fixed database script to h... | Cedric Nugteren |
2016-08-21 | Also changed the default-default for unknown device types to use the same met... | Cedric Nugteren |
2016-08-21 | Updated the changelog; refactored the database-get-bests code a bit | Cedric Nugteren |
2016-08-15 | Updated the database script to calculate the relative best performance of tun... | Cedric Nugteren |
2016-08-09 | Improved the speed of the new common-best defaults method for the database ge... | Cedric Nugteren |
2016-08-07 | Added a first version of the database's common-best default calculation | Cedric Nugteren |
2016-07-25 | Moved the XgemvFast and XgemvFastRot tuning database into a separate file | Cedric Nugteren |
2016-07-24 | Refactored the Python database script: separated functionality in modules, no... | Cedric Nugteren |
2016-07-03 | Added tuning results for GTX670, GTX750, and GTX1070 (thanks to gcp) | Cedric Nugteren |
2016-07-02 | Prints the current pandas version and reports the minimum required version | Cedric Nugteren |
2016-06-30 | Added declspec(dllexport) to ClearCache and FillCache, and added declspec(dll... | Cedric Nugteren |
2016-06-27 | Moved the performance graph scripts to the 'scripts' subfolder | Cedric Nugteren |
2016-06-19 | Minor fix to the database script | Cedric Nugteren |
2016-06-19 | Renamed all C++ source files to .cpp to match the .hpp extension better | Cedric Nugteren |
2016-06-18 | Moved all headers into the source tree, changed headers to .hpp extension | Cedric Nugteren |
2016-06-18 | Clean-up of the routine class, moved RunKernel to the routine/common file | Cedric Nugteren |
2016-06-16 | Added XOMATCOPY routines to perform out-of-place matrix scaling, copying, and... | Cedric Nugteren |
2016-06-13 | Improved API documentation and added documentation for level-2 and level-3 ro... | Cedric Nugteren |
2016-06-10 | Added documentation for the matrix-update level-2 family of routines | Cedric Nugteren |
2016-06-02 | Added return value to the test binaries (0: success, 1: failure), allowing it... | Cedric Nugteren |
2016-05-26 | Added half-precision tests for the clBLAS reference through conversion to sin... | Cedric Nugteren |
2016-05-26 | Added half-precision tests for the CBLAS reference through conversion to sing... | Cedric Nugteren |
2016-05-25 | Added possibility to run the performance client with half-precision | Cedric Nugteren |
2016-05-25 | Added level-3 half-precision routines HGEMM/HSYMM/HSYRK/HSYR2K/HTRMM | Cedric Nugteren |
2016-05-22 | Added level-2 half-precision routines HGER/HSYR/HSPR/HSYR2/HSPR2 | Cedric Nugteren |
2016-05-22 | Added level-2 half-precision routines HGEMV/HGBMV/HHEMV/HHBMV/HHPMV/HSYMV/HSB... | Cedric Nugteren |
2016-05-22 | Added level-1 half-precision routines HSWAP/HSCAL/HCOPY/HAXPY/HDOT/HNRM2/HASU... | Cedric Nugteren |
2016-05-18 | Merged in latest changes from 0.7.1 release | Cedric Nugteren |
2016-05-13 | Initial experimental version of the half-precision HAXPY routine | Cedric Nugteren |
2016-05-12 | Initial changes in preparation for half-precision fp16 support | Cedric Nugteren |