debian-clblast
Debian package for CLBlast.
gspr@nonempty.org
path: root/src/kernels
Date | Commit message | Author
2019-09-04 | Fix out-of-bounds read/write in XhadFaster | etomzak
2019-05-19 | Fixed a bug in the absolute-min index kernel | Cedric Nugteren
2019-05-08 | Changed back to cl_intel_subgroups as suggested | Cedric Nugteren
2019-05-07 | Enabled avc_motion_estimation extension for Intel subgroup shuffling | Cedric Nugteren
2018-12-18 | Fix xconvgemm kernel and enable ConvGemmMethod::kSingleKernel | Koichi Akabe
2018-11-19 | Remove unnecessary attribute of inline function | Koichi Akabe
2018-11-12 | Add kernel_mode option to im2col, col2im, and convgemm functions | Koichi Akabe
2018-11-07 | Changed col2im to append to the existing im-buffer | Cedric Nugteren
2018-11-01 | Added new col2im routine to the documentation | Cedric Nugteren
2018-10-30 | Fix col2im implementation | Koichi Akabe
2018-10-23 | Added groundwork for col2im algorithm plus first non-working version of kerne... | Cedric Nugteren
2018-10-17 | Fixed a bug with the pre-processing and the AXPY kernel | Cedric Nugteren
2018-10-15 | Fixed a bug in the XaxpyFaster kernel for specific parameters | Cedric Nugteren
2018-10-14 | Merge pull request #319 from CNugteren/convgemm_multi_kernel | Cedric Nugteren
2018-10-10 | Fixed pre-processor warnings related to the subgroup shuffling | Cedric Nugteren
2018-09-16 | Merge branch 'master' into convgemm_multi_kernel | Cedric Nugteren
2018-09-15 | Fixed an MSVC compilation error due to large strings | Cedric Nugteren
2018-09-15 | Fixed issues with GEMMK=1 kernel and the pre-processor | Cedric Nugteren
2018-09-07 | Added xCONVGEMM as im2col plus a batched GEMM kernel | Cedric Nugteren
2018-07-29 | Merge branch 'master' into CLBlast-267-convgemm | Cedric Nugteren
2018-07-28 | Disabled the use of staggered indices on AMD GPUs for the new GEMMK == 1 kern... | Cedric Nugteren
2018-07-27 | Fixed an issue with AMD GPUs and the new GEMMK == 1 kernel | Cedric Nugteren
2018-07-16 | moved a two-line macro to a single line | Tyler Sorensen
2018-07-14 | Applied feedback from Cedric from first pull request | Tyler Sorensen
2018-07-11 | added inline ptx to support shuffle on Nvidia GPUs | Tyler Sorensen
2018-06-03 | Merge branch 'master' into CLBlast-267-convgemm | Cedric Nugteren
2018-05-31 | Some potential fixes for error -54 when launching TRSV and TRSM kernels | Cedric Nugteren
2018-05-21 | Further implemented single-kernel approach of convgemm; extended test to capt... | Cedric Nugteren
2018-05-21 | Added method selection option to switch between im2col and single-kernel appr... | Cedric Nugteren
2018-05-19 | Moved new convgemm kernel to levelx kernel folder | Cedric Nugteren
2018-05-19 | Second version of direct reading from image tensor for convgemm: also with lo... | Cedric Nugteren
2018-05-17 | First version of direct reading from image tensor for convgemm: only for edge... | Cedric Nugteren
2018-05-13 | Created a dedicated convgemm GEMM kernel as a copy of the batched direct gemm... | Cedric Nugteren
2018-05-13 | Plugged in the code of strided-batched-gemm into convgemm in preparation of a... | Cedric Nugteren
2018-04-24 | Added Intel subgroup shuffle support to the 2D register caching GEMM kernel | Cedric Nugteren
2018-04-08 | Fixed issues with the pre-processor | Cedric Nugteren
2018-04-07 | Extended the GEMM tuner to be able to tune the new 'kernel 1' | Cedric Nugteren
2018-04-07 | Fixed a compilation issue for complex datatypes and vload | Cedric Nugteren
2018-04-06 | Fixed a compilation issue for complex datatypes and vload | Cedric Nugteren
2018-04-03 | Added first version of 2D register tiling kernel with A and C transposed as well | Cedric Nugteren
2018-03-23 | Removed arrays as function argument from GEMM kernels for Vivante OpenCL comp... | Cedric Nugteren
2018-03-15 | Fixed a failing TRSM test using a CPU with Apple OpenCL | Cedric Nugteren
2018-03-15 | Fixed a failing TRSV test using a CPU with Apple OpenCL | Cedric Nugteren
2018-02-02 | Implemented the XHAD Hadamard product routine | Cedric Nugteren
2018-01-08 | Implemented the in-direct version of the strided-batched GEMM kernel | Cedric Nugteren
2018-01-07 | Implemented direct version of strided-batched GEMM kernel | Cedric Nugteren
2017-12-31 | Revert "Added options to disable parts of the invert kernel to find out where... | Cedric Nugteren
2017-12-31 | Changed the invert kernel slightly; added part1a/part1b disable-defines | Cedric Nugteren
2017-12-30 | Fixed ifdef's into ifndef's | Cedric Nugteren
2017-12-30 | Added options to disable parts of the invert kernel to find out where the AMD... | Cedric Nugteren