debian-clblast
Debian package for CLBlast.
gspr@nonempty.org
Age
Commit message
Author
2018-05-19
Merge branch 'master' into CLBlast-267-convgemm
Cedric Nugteren
2018-05-19
Added an option to run the routine tuner for a single specific GEMM size
Cedric Nugteren
2018-05-19
Merge pull request #284 from CNugteren/routine_tuners_read_kernel_json_from_disk
Cedric Nugteren
2018-05-19
Fixed compilation issues
Cedric Nugteren
2018-05-19
The GEMM routine tuner now loads kernel JSON tuning results from disk if avai...
Cedric Nugteren
2018-05-19
Fixed a bug in loading xgemm-direct JSON data from disk
Cedric Nugteren
2018-05-18
Merge pull request #283 from CNugteren/canary_buffer_overflow_protection
Cedric Nugteren
2018-05-18
Merge branch 'master' into canary_buffer_overflow_protection
Cedric Nugteren
2018-05-17
Merge pull request #282 from CNugteren/CLBlast-276-program-release-improvements
Cedric Nugteren
2018-05-17
Updated the roadmap
Cedric Nugteren
2018-05-17
Updated README with IWOCL talk and GPU zoo acknowledgment
Cedric Nugteren
2018-05-17
Added documentation on some details of the GEMM implementation
Cedric Nugteren
2018-05-17
Fixed a few issues with canary region testing
Cedric Nugteren
2018-05-17
Added a canary region for overflow detection to the correctness tests
Cedric Nugteren
2018-05-17
Added a canary region for overflow detection to the tuners
Cedric Nugteren
2018-05-17
First version of direct reading from image tensor for convgemm: only for edge...
Cedric Nugteren
2018-05-13
Created a dedicated convgemm GEMM kernel as a copy of the batched direct gemm...
Cedric Nugteren
2018-05-13
Plugged in the code of strided-batched-gemm into convgemm in preparation of a...
Cedric Nugteren
2018-05-09
Changed temporary convgemm implementation to use batched-strided GEMM
Cedric Nugteren
2018-05-09
Fixed the performance client for convgemm and added GFLOPS measurements
Cedric Nugteren
2018-05-09
Merge pull request #279 from umar456/ci_links
Cedric Nugteren
2018-05-09
Updated the documentation for convgemm to include data layout (NCHW)
Cedric Nugteren
2018-05-09
Implemented convolution as im2col + GEMM
Cedric Nugteren
2018-05-09
Split channels/strides testing values off from kernel sizes for more flexibility
Cedric Nugteren
2018-05-08
Update ci links to use doman names and build names instead of IP/id
Umar Arshad
2018-05-06
Added convgemm skeleton, test infrastructure, and first reference implementation
Cedric Nugteren
2018-05-05
Added interface of batched convolution as GEMM
Cedric Nugteren
2018-05-01
Updated README with new badges and paper citation
Cedric Nugteren
2018-05-01
Now stores a shared_ptr to the Program class in the cache
Cedric Nugteren
2018-04-29
Merge pull request #277 from CNugteren/CLBlast-257-intel-subgroups
Cedric Nugteren
2018-04-29
Updated the changelog
Cedric Nugteren
2018-04-29
Updated the roadmap
Cedric Nugteren
2018-04-26
Fixed an access violation when compiled with Visual Studio upon releasing the...
Cedric Nugteren
2018-04-24
Added Intel subgroup shuffle support to the 2D register caching GEMM kernel
Cedric Nugteren
2018-04-24
Added a define to enable subgroup shuffling if supported by the device
Cedric Nugteren
2018-04-21
Merge pull request #274 from CNugteren/CLBlast-228-2d-register-gemm-kernel
Cedric Nugteren
2018-04-20
Fixes for the CUDA API
Cedric Nugteren
2018-04-18
Expressed HER2K as two HERK calls
Cedric Nugteren
2018-04-18
Expressed SYR2K as two SYRK calls
Cedric Nugteren
2018-04-17
Updated HERK and SYRK to follow the GEMM style and functions to make it work ...
Cedric Nugteren
2018-04-15
Fixed some failing tests for GEMM and batched GEMM routines
Cedric Nugteren
2018-04-15
Updated tuning results for the Skylake ULT GT2 GPU with the new kernel
Cedric Nugteren
2018-04-13
Made GEMM rotation expectations kernel-specific
Cedric Nugteren
2018-04-10
Updated database with defaults of GEMMK=0 and KREG=1
Cedric Nugteren
2018-04-10
Made it possible to add tuning parameters to the database using the script
Cedric Nugteren
2018-04-10
Fixed a bug in the compression part of the database script
Cedric Nugteren
2018-04-08
Extended the maximum number of tuning parameters from 14 to 16
Cedric Nugteren
2018-04-08
Fixed issues with the pre-processor
Cedric Nugteren
2018-04-07
Merge branch 'master' into CLBlast-228-2d-register-gemm-kernel
Cedric Nugteren
2018-04-07
Added tuning results for NVIDIA GeForce 970
Cedric Nugteren