index : debian-clblast
Debian package for CLBlast.
gspr@nonempty.org
path: root/src/utilities
| Age | Commit message | Author |
|---|---|---|
| 2018-11-12 | Add kernel_mode option to im2col, col2im, and convgemm functions | Koichi Akabe |
| 2018-10-30 | Fix col2im implementation | Koichi Akabe |
| 2018-09-16 | Merge branch 'master' into convgemm_multi_kernel | Cedric Nugteren |
| 2018-09-15 | Disabled Intel subgroup shuffling for double-precision | Cedric Nugteren |
| 2018-07-29 | Merge branch 'master' into CLBlast-267-convgemm | Cedric Nugteren |
| 2018-07-23 | Merge pull request #297 from tyler-utah/master | Cedric Nugteren |
| 2018-07-14 | Applied feedback from Cedric from first pull request | Tyler Sorensen |
| 2018-07-13 | Added device-name removal code to handle POCL naming convention | Cedric Nugteren |
| 2018-07-11 | added inline ptx to support shuffle on Nvidia GPUs | Tyler Sorensen |
| 2018-06-03 | Merge branch 'master' into CLBlast-267-convgemm | Cedric Nugteren |
| 2018-05-23 | Added an option in the clients to output timing statistics: minimum, mean, an... | Cedric Nugteren |
| 2018-05-19 | Merge branch 'master' into CLBlast-267-convgemm | Cedric Nugteren |
| 2018-05-18 | Merge branch 'master' into canary_buffer_overflow_protection | Cedric Nugteren |
| 2018-05-17 | Added a canary region for overflow detection to the tuners | Cedric Nugteren |
| 2018-05-06 | Added convgemm skeleton, test infrastructure, and first reference implementation | Cedric Nugteren |
| 2018-05-01 | Now stores a shared_ptr to the Program class in the cache | Cedric Nugteren |
| 2018-04-24 | Added a define to enable subgroup shuffling if supported by the device | Cedric Nugteren |
| 2018-03-06 | First version of the tuning API, added interface for copy-kernel, added sample | Cedric Nugteren |
| 2018-02-11 | Fixed a minor typo | Cedric Nugteren |
| 2017-12-24 | Fixes for the CUDA backend of CLBlast | Cedric Nugteren |
| 2017-12-23 | Added TRSV block-size tuner | Cedric Nugteren |
| 2017-12-17 | Removed all ARM Mali tuning results; re-added Mali-T760 and Mali-T628 results... | Cedric Nugteren |
| 2017-12-10 | Fixed a missing include | Cedric Nugteren |
| 2017-12-09 | Made the pre-processor run by default for ARM and Qualcomm GPUs | Cedric Nugteren |
| 2017-11-30 | Integrated pre-processor in compilation flow, default is still disabled | Cedric Nugteren |
| 2017-11-25 | Moved string splitting functions; added string character removal function | Cedric Nugteren |
| 2017-11-22 | Made parameter override in the clients a command-line argument and added supp... | Cedric Nugteren |
| 2017-11-19 | Added compilation timing and better compilation error reporting | Cedric Nugteren |
| 2017-11-19 | Revived the GEMM routine tuner; minor formatting changes | Cedric Nugteren |
| 2017-11-17 | Moved compilation function to separate file; removed dependency of tuners of ... | Cedric Nugteren |
| 2017-11-15 | Added first version of integrated and re-written auto-tuner | Cedric Nugteren |
| 2017-11-15 | Added kernel timing functionality to the utilities | Cedric Nugteren |
| 2017-11-15 | Added exception handle with catch-all | Cedric Nugteren |
| 2017-11-13 | Made the exception dispatch function optionally silent | Cedric Nugteren |
| 2017-11-13 | Moved square-difference utility function for use in the tuners | Cedric Nugteren |
| 2017-11-07 | Merge pull request #212 from CNugteren/kernel_selection_tuner | Cedric Nugteren |
| 2017-11-02 | Integrated the GEMM routine tuner for kernel selection; added first tuning re... | Cedric Nugteren |
| 2017-10-30 | Added collecting and printing of scores for the kernel-selection tuner | Cedric Nugteren |
| 2017-10-29 | Added Android support using the GNU C++ STL library and the GCC toolchain | Cedric Nugteren |
| 2017-10-28 | Merge branch 'master' into android_support | Cedric Nugteren |
| 2017-10-28 | Added initial version of a GEMM kernel selection tuner | Cedric Nugteren |
| 2017-10-28 | Moved timing function to a separate file | Cedric Nugteren |
| 2017-10-15 | Various fixes to make the first CUDA examples work | Cedric Nugteren |
| 2017-10-12 | CUDA API now takes context and device in instead of stream | Cedric Nugteren |
| 2017-10-11 | Added first (untested) version of a CUDA API | Cedric Nugteren |
| 2017-10-09 | Removed include of clpp11.hpp in places other than utilities.hpp | Cedric Nugteren |
| 2017-10-08 | Moved the remaining OpenCL specific host code to the clpp11.h header where it... | Cedric Nugteren |
| 2017-10-07 | Synchronizes clpp11.h with CLCudaAPI 9.0 | Cedric Nugteren |
| 2017-09-26 | Added Android header for compilation with gnustl STL | Cedric Nugteren |
| 2017-09-16 | Fixed a compilation error and warning under MacOS | Cedric Nugteren |