debian-clblast
Debian package for CLBlast.
gspr@nonempty.org
path: root/src/routines
Age
Commit message
Author
2019-01-19
Merge pull request #345 from CNugteren/convolution-fixes-and-tuner
Cedric Nugteren
2019-01-05
Added a check to prevent the stride of matrix C being set to 0 for the stride...
Cedric Nugteren
2018-12-31
Added convgemm to the CLBlast database, added initial parameters for Skylake GPU
Cedric Nugteren
2018-12-18
Fix xconvgemm kernel and enable ConvGemmMethod::kSingleKernel
Koichi Akabe
2018-11-30
Fixed an issue for unequal MWG and NWG and the new GEMMK == 1 kernel
Cedric Nugteren
2018-11-12
Add kernel_mode option to im2col, col2im, and convgemm functions
Koichi Akabe
2018-10-30
Fix col2im implementation
Koichi Akabe
2018-10-23
Added groundwork for col2im algorithm plus first non-working version of kerne...
Cedric Nugteren
2018-10-22
Some name changes in im2col code
Cedric Nugteren
2018-09-16
Merge branch 'master' into convgemm_multi_kernel
Cedric Nugteren
2018-09-15
Fixed an MSVC compilation error due to large strings
Cedric Nugteren
2018-09-07
Added xCONVGEMM as im2col plus a batched GEMM kernel
Cedric Nugteren
2018-08-13
Made last operation in TRSV and TRSM asynchronous, making the events not null
Cedric Nugteren
2018-08-13
Small refactoring of events in TRSV substitution routine
Cedric Nugteren
2018-07-31
Fixed issue with not performing complex conjugation under certain cases when ...
Cedric Nugteren
2018-06-03
Merge branch 'master' into CLBlast-267-convgemm
Cedric Nugteren
2018-06-01
Fixes for Apple OpenCL CPU implementation which requires a LWGS of 1 when bar...
Cedric Nugteren
2018-05-31
Added error-checking for half-empty local work group sizes; fixed a minor TRS...
Cedric Nugteren
2018-05-31
Some potential fixes for error -54 when launching TRSV and TRSM kernels
Cedric Nugteren
2018-05-30
Widened Apple OpenCL check, added way to debug too-large-workgroups issue
Cedric Nugteren
2018-05-27
Added a check to return 'NotImplemented' error code in case of systems with <...
Cedric Nugteren
2018-05-27
Made FillMatrix and FillVector functions take a configurable local workgroup ...
Cedric Nugteren
2018-05-21
Added method selection option to switch between im2col and single-kernel appr...
Cedric Nugteren
2018-05-19
Moved new convgemm kernel to levelx kernel folder
Cedric Nugteren
2018-05-19
Second version of direct reading from image tensor for convgemm: also with lo...
Cedric Nugteren
2018-05-19
Merge branch 'master' into CLBlast-267-convgemm
Cedric Nugteren
2018-05-17
First version of direct reading from image tensor for convgemm: only for edge...
Cedric Nugteren
2018-05-13
Created a dedicated convgemm GEMM kernel as a copy of the batched direct gemm...
Cedric Nugteren
2018-05-13
Plugged in the code of strided-batched-gemm into convgemm in preparation of a...
Cedric Nugteren
2018-05-09
Changed temporary convgemm implementation to use batched-strided GEMM
Cedric Nugteren
2018-05-09
Implemented convolution as im2col + GEMM
Cedric Nugteren
2018-05-06
Added convgemm skeleton, test infrastructure, and first reference implementation
Cedric Nugteren
2018-05-01
Now stores a shared_ptr to the Program class in the cache
Cedric Nugteren
2018-04-18
Expressed HER2K as two HERK calls
Cedric Nugteren
2018-04-18
Expressed SYR2K as two SYRK calls
Cedric Nugteren
2018-04-17
Updated HERK and SYRK to follow the GEMM style and functions to make it work ...
Cedric Nugteren
2018-04-15
Fixed some failing tests for GEMM and batched GEMM routines
Cedric Nugteren
2018-04-13
Made GEMM rotation expectations kernel-specific
Cedric Nugteren
2018-03-15
Fixed a failing TRSM test using a CPU with Apple OpenCL
Cedric Nugteren
2018-03-15
Fixed a failing TRSV test using a CPU with Apple OpenCL
Cedric Nugteren
2018-02-02
Implemented the XHAD Hadamard product routine
Cedric Nugteren
2018-01-31
Created the API and stubs for the HAD (hadamard-product) routines
Cedric Nugteren
2018-01-26
Fixed an event synchronisation issue in the batched gemm routines
Cedric Nugteren
2018-01-18
Made the batched routines also chose direct/indirect kernel like the main GEM...
Cedric Nugteren
2018-01-08
Implemented the in-direct version of the strided-batched GEMM kernel
Cedric Nugteren
2018-01-07
Implemented direct version of strided-batched GEMM kernel
Cedric Nugteren
2018-01-07
Added API and tests for new GemmStridedBatched routine
Cedric Nugteren
2018-01-06
Reduced duplicate code in the batched GEMM implementation
Cedric Nugteren
2018-01-06
Fixed the CUDA interface: replaced nullptr with 0
Cedric Nugteren
2017-12-30
Added optional temp-buffer argument to C++ interface of GEMM
Cedric Nugteren