Age | Commit message | Author |
2018-12-17 | Fix half-float+kernel_mode test cases of im2col, col2im, and convgemm | Koichi Akabe |
2018-11-12 | Add kernel_mode option to im2col, col2im, and convgemm functions | Koichi Akabe |
2018-11-07 | Changed col2im to append to the existing im-buffer | Cedric Nugteren |
2018-11-01 | Fixed half-precision tests for im2col and col2im | Cedric Nugteren |
2018-10-30 | Fix col2im implementation | Koichi Akabe |
2018-10-23 | Added groundwork for col2im algorithm plus first non-working version of kerne... | Cedric Nugteren |
2018-10-22 | Some name changes in im2col code | Cedric Nugteren |
2018-10-17 | Fixed MSVC's compilation error C1061 due to too many for-loops | Cedric Nugteren |
2018-09-16 | Merge branch 'master' into convgemm_multi_kernel | Cedric Nugteren |
2018-09-15 | Added pre-processor test for GEMMK=1 kernel | Cedric Nugteren |
2018-09-07 | Reduced size of the xCONVGEMM correctness tests | Cedric Nugteren |
2018-09-07 | Added reference implementation for xCONVGEMM for half-precision | Cedric Nugteren |
2018-07-31 | Fixed the tests of OMATCOPY to include proper complex conjugation | Cedric Nugteren |
2018-07-31 | Fixed an error reporting issue related to the canary region | Cedric Nugteren |
2018-07-29 | Removed complex numbers support for CONVGEMM | Cedric Nugteren |
2018-06-03 | Merge branch 'master' into CLBlast-267-convgemm | Cedric Nugteren |
2018-06-02 | Added MKL as an alternative for CBLAS for correctness and performance compari... | Cedric Nugteren |
2018-05-27 | Added maximum time reporting to the client statistics | Cedric Nugteren |
2018-05-23 | Added an option in the clients to output timing statistics: minimum, mean, an... | Cedric Nugteren |
2018-05-21 | Further implemented single-kernel approach of convgemm; extended test to capt... | Cedric Nugteren |
2018-05-19 | Merge branch 'master' into CLBlast-267-convgemm | Cedric Nugteren |
2018-05-19 | Fixed a bug in loading xgemm-direct JSON data from disk | Cedric Nugteren |
2018-05-17 | Fixed a few issues with canary region testing | Cedric Nugteren |
2018-05-17 | Added a canary region for overflow detection to the correctness tests | Cedric Nugteren |
2018-05-09 | Fixed the performance client for convgemm and added GFLOPS measurements | Cedric Nugteren |
2018-05-09 | Split channels/strides testing values off from kernel sizes for more flexibility | Cedric Nugteren |
2018-05-06 | Added convgemm skeleton, test infrastructure, and first reference implementation | Cedric Nugteren |
2018-04-15 | Fixed some failing tests for GEMM and batched GEMM routines | Cedric Nugteren |
2018-03-15 | Fixed breaking preprocessor test on certain platforms due to empty kernel string | Cedric Nugteren |
2018-02-02 | Implemented the XHAD Hadamard product routine | Cedric Nugteren |
2018-01-31 | Created the API and stubs for the HAD (hadamard-product) routines | Cedric Nugteren |
2018-01-14 | Small improvements to benchmarking for cuBLAS | Cedric Nugteren |
2018-01-11 | Added test for the RetrieveParameters function | Cedric Nugteren |
2018-01-11 | Fixed bug in override parameters test | Cedric Nugteren |
2018-01-07 | Added API and tests for new GemmStridedBatched routine | Cedric Nugteren |
2018-01-06 | Prevented half-precision batched routines from failing in the tests | Cedric Nugteren |
2018-01-06 | Added CUDA interface to get temporary-buffer size for GEMM routine | Cedric Nugteren |
2018-01-03 | Added the temp-buffer to the GEMM testers and clients | Cedric Nugteren |
2018-01-03 | Added a queue argument to the get-size function when running the tests/clients | Cedric Nugteren |
2017-12-24 | Fixes for the CUDA backend of CLBlast | Cedric Nugteren |
2017-12-23 | Fixed unused variable warnings showing up with Clang | Cedric Nugteren |
2017-12-10 | Split GEMM kernel in 4 files instead of 3 due to MSVC 2013 string length limit | Cedric Nugteren |
2017-12-09 | Completed kernel modifications for pre-processor of all other kernels | Cedric Nugteren |
2017-12-09 | Made the pre-processor run by default for ARM and Qualcomm GPUs | Cedric Nugteren |
2017-12-09 | Fixed defines parsing and substituting in pre-processor; fixed some variable ... | Cedric Nugteren |
2017-12-05 | Improved array-to-register promotion, now handling function calls as well | Cedric Nugteren |
2017-12-03 | Added GEMM (direct and in-direct) to the pre-processor testing; modified the ... | Cedric Nugteren |
2017-12-03 | Reformatted transpose kernels for the pre-processor; extended the number of tests | Cedric Nugteren |
2017-11-30 | Improved the pre-processor's handling of defines; added a special nested defi... | Cedric Nugteren |
2017-11-30 | Integrated pre-processor in compilation flow, default is still disabled | Cedric Nugteren |