debian-clblast
Debian package for CLBlast.
gspr@nonempty.org
path: root/src
Age
Commit message
Author
2022-04-25
Add tuning results for Radeon RX 6500 XT
Cedric Nugteren
2022-04-25
Add tuning results for Radeon RX 6800 XT
Cedric Nugteren
2022-04-13
android.hpp: custom header guard of _clang_
danyougle
2021-08-27
Add Quadro T2000 tuning parameters for the Tesla T4
Cedric Nugteren
2021-08-27
Remove Tesla T4 tuning results
Cedric Nugteren
2021-08-19
Add tuning results for NVIDIA Tesla V100
Cedric Nugteren
2021-08-19
Add tuning results for NVIDIA Tesla T4
Cedric Nugteren
2021-08-19
Add tuning results for NVIDIA Quadro T2000
Cedric Nugteren
2021-08-19
Add tuning results for NVIDIA Quadro GV100
Cedric Nugteren
2021-08-19
Add tuning results for Intel Core i9-9980HK
Cedric Nugteren
2021-08-19
Add tuning results for NVIDIA A100
Cedric Nugteren
2021-05-22
Fix issue with printing out-of-bounds local/global sizes for level 1 tuners
Cedric Nugteren
2021-03-13
set the correct flop count for xgemm
JishinMaster
2021-02-05
Fix Windows paths in pyclblast
Cedric Nugteren
2021-02-04
Added second Windows library path
Cedric Nugteren
2021-01-30
Add library path for Windows as well
Cedric Nugteren
2021-01-29
Add library dir on Linux for pyclblast
Cedric Nugteren
2021-01-21
Update pyclblast package version number
Cedric Nugteren
2021-01-20
Use reference types to prevent unnecessary copying
Jerry James
2020-10-10
Add tuning results for TITAN RTX
Cedric Nugteren
2020-10-10
Add tuning results for Radeon RX Vega
Cedric Nugteren
2020-06-07
Add a cautionary note in Program::GetIR and mention the fix in CHANGELOG
Pradeep Garigipati
2020-06-05
Fix Program::GetIR to handle programs with multiple devices
Pradeep Garigipati
2020-05-11
Increase display width of the local/global sizes
Cedric Nugteren
2020-05-10
Made sure that the global workgroup size is a multiple of the local size in t...
Cedric Nugteren
2020-05-10
Added logging of local/global workgroup sizes when run the tuners
Cedric Nugteren
2020-05-10
Updated PyCLBlast version number
Cedric Nugteren
2020-05-10
Added a sample to demonstrate a batched routine
Cedric Nugteren
2020-05-10
Added pyclblast bindings for the 3 batched routines
Cedric Nugteren
2020-05-03
Move queue creation out of the tuner loop
Cedric Nugteren
2020-03-08
Made it more likely (but no guarantees) for amax/amin to return the first index
Cedric Nugteren
2020-03-08
Silenced a new OpenCL warning message
Cedric Nugteren
2020-02-17
Catches all exceptions of the tuners
Cedric Nugteren
2019-12-09
Reduce TestMatrix calls for xgemmstridedbatched.
Tarmo Räntilä
2019-12-09
Reduce TestMatrix calls for xgemmbatched.
Tarmo Räntilä
2019-09-04
Fix out-of-bounds read/write in XhadFaster
etomzak
2019-05-19
Fixed a bug in the absolute-min index kernel
Cedric Nugteren
2019-05-11
Added a function to set the OpenCL kernel standard, either 1.1 or 1.2
Cedric Nugteren
2019-05-08
Changed back to cl_intel_subgroups as suggested
Cedric Nugteren
2019-05-07
Added a host-code check to make sure the avc_motion_estimation is available
Cedric Nugteren
2019-05-07
Enabled avc_motion_estimation extension for Intel subgroup shuffling
Cedric Nugteren
2019-05-03
Remove assert for extention not available in macOS
Umar Arshad
2019-02-09
Added tuning parameters for Tesla P100 16GB
Cedric Nugteren
2019-02-09
Added tuning parameters for Xeon E5-2630 v3 and v4
Cedric Nugteren
2019-01-23
Added fp32 to fp16 conversion function in Python to make haxpy example work
Cedric Nugteren
2019-01-22
Added a (non-working) sample of half precision AXPY in Python
Cedric Nugteren
2019-01-22
Updated pyclblast README, updated to 1.2.0 for half-precision support
Cedric Nugteren
2019-01-22
Added experimental support for half-precision in pyclblast
Cedric Nugteren
2019-01-19
Merge pull request #345 from CNugteren/convolution-fixes-and-tuner
Cedric Nugteren
2019-01-19
Added a few more initial Intel tuning parameters for convgemm
Cedric Nugteren