path: root/src/routines
Age | Commit message | Author
2019-12-09 | Reduce TestMatrix calls for xgemmstridedbatched. | Tarmo Räntilä
2019-12-09 | Reduce TestMatrix calls for xgemmbatched. | Tarmo Räntilä
2019-01-19 | Merge pull request #345 from CNugteren/convolution-fixes-and-tuner | Cedric Nugteren
2019-01-05 | Added a check to prevent the stride of matrix C being set to 0 for the stride... | Cedric Nugteren
2018-12-31 | Added convgemm to the CLBlast database, added initial parameters for Skylake GPU | Cedric Nugteren
2018-12-18 | Fix xconvgemm kernel and enable ConvGemmMethod::kSingleKernel | Koichi Akabe
2018-11-30 | Fixed an issue for unequal MWG and NWG and the new GEMMK == 1 kernel | Cedric Nugteren
2018-11-12 | Add kernel_mode option to im2col, col2im, and convgemm functions | Koichi Akabe
2018-10-30 | Fix col2im implementation | Koichi Akabe
2018-10-23 | Added groundwork for col2im algorithm plus first non-working version of kerne... | Cedric Nugteren
2018-10-22 | Some name changes in im2col code | Cedric Nugteren
2018-09-16 | Merge branch 'master' into convgemm_multi_kernel | Cedric Nugteren
2018-09-15 | Fixed an MSVC compilation error due to large strings | Cedric Nugteren
2018-09-07 | Added xCONVGEMM as im2col plus a batched GEMM kernel | Cedric Nugteren
2018-08-13 | Made last operation in TRSV and TRSM asynchronous, making the events not null | Cedric Nugteren
2018-08-13 | Small refactoring of events in TRSV substitution routine | Cedric Nugteren
2018-07-31 | Fixed issue with not performing complex conjugation under certain cases when ... | Cedric Nugteren
2018-06-03 | Merge branch 'master' into CLBlast-267-convgemm | Cedric Nugteren
2018-06-01 | Fixes for Apple OpenCL CPU implementation which requires a LWGS of 1 when bar... | Cedric Nugteren
2018-05-31 | Added error-checking for half-empty local work group sizes; fixed a minor TRS... | Cedric Nugteren
2018-05-31 | Some potential fixes for error -54 when launching TRSV and TRSM kernels | Cedric Nugteren
2018-05-30 | Widened Apple OpenCL check, added way to debug too-large-workgroups issue | Cedric Nugteren
2018-05-27 | Added a check to return 'NotImplemented' error code in case of systems with <... | Cedric Nugteren
2018-05-27 | Made FillMatrix and FillVector functions take a configurable local workgroup ... | Cedric Nugteren
2018-05-21 | Added method selection option to switch between im2col and single-kernel appr... | Cedric Nugteren
2018-05-19 | Moved new convgemm kernel to levelx kernel folder | Cedric Nugteren
2018-05-19 | Second version of direct reading from image tensor for convgemm: also with lo... | Cedric Nugteren
2018-05-19 | Merge branch 'master' into CLBlast-267-convgemm | Cedric Nugteren
2018-05-17 | First version of direct reading from image tensor for convgemm: only for edge... | Cedric Nugteren
2018-05-13 | Created a dedicated convgemm GEMM kernel as a copy of the batched direct gemm... | Cedric Nugteren
2018-05-13 | Plugged in the code of strided-batched-gemm into convgemm in preparation of a... | Cedric Nugteren
2018-05-09 | Changed temporary convgemm implementation to use batched-strided GEMM | Cedric Nugteren
2018-05-09 | Implemented convolution as im2col + GEMM | Cedric Nugteren
2018-05-06 | Added convgemm skeleton, test infrastructure, and first reference implementation | Cedric Nugteren
2018-05-01 | Now stores a shared_ptr to the Program class in the cache | Cedric Nugteren
2018-04-18 | Expressed HER2K as two HERK calls | Cedric Nugteren
2018-04-18 | Expressed SYR2K as two SYRK calls | Cedric Nugteren
2018-04-17 | Updated HERK and SYRK to follow the GEMM style and functions to make it work ... | Cedric Nugteren
2018-04-15 | Fixed some failing tests for GEMM and batched GEMM routines | Cedric Nugteren
2018-04-13 | Made GEMM rotation expectations kernel-specific | Cedric Nugteren
2018-03-15 | Fixed a failing TRSM test using a CPU with Apple OpenCL | Cedric Nugteren
2018-03-15 | Fixed a failing TRSV test using a CPU with Apple OpenCL | Cedric Nugteren
2018-02-02 | Implemented the XHAD Hadamard product routine | Cedric Nugteren
2018-01-31 | Created the API and stubs for the HAD (hadamard-product) routines | Cedric Nugteren
2018-01-26 | Fixed an event synchronisation issue in the batched gemm routines | Cedric Nugteren
2018-01-18 | Made the batched routines also chose direct/indirect kernel like the main GEM... | Cedric Nugteren
2018-01-08 | Implemented the in-direct version of the strided-batched GEMM kernel | Cedric Nugteren
2018-01-07 | Implemented direct version of strided-batched GEMM kernel | Cedric Nugteren
2018-01-07 | Added API and tests for new GemmStridedBatched routine | Cedric Nugteren
2018-01-06 | Reduced duplicate code in the batched GEMM implementation | Cedric Nugteren