debian-clblast
Debian package for CLBlast.
gspr@nonempty.org
path: root/scripts
Age
Commit message
Author
2017-11-20
Made the database script properly handle multiple entries for a single device
Cedric Nugteren
2017-11-19
Minor fix to the database script
Cedric Nugteren
2017-11-19
Some fixed for the new auto-tuner to be compatible with the Python scripts
Cedric Nugteren
2017-11-06
Improved the way the database defaults are computed
Cedric Nugteren
2017-11-02
Integrated the GEMM routine tuner for kernel selection; added first tuning re...
Cedric Nugteren
2017-11-02
Fixed a bug in database compression/decompression
Cedric Nugteren
2017-10-14
Various fixes to make the host code and sample compile with the CUDA API
Cedric Nugteren
2017-10-12
CUDA API now takes context and device in instead of stream
Cedric Nugteren
2017-10-11
Added first (untested) version of a CUDA API
Cedric Nugteren
2017-10-09
Fixed the Python generator script w.r.t. the recent change of testing direct/...
Cedric Nugteren
2017-10-08
Moved non-routine-specific API functions and includes to separate files
Cedric Nugteren
2017-09-16
Improved compilation time of the tuner database
Cedric Nugteren
2017-09-14
Added architecture layer in the tuning database for better performance on uns...
Cedric Nugteren
2017-09-12
Added database compress and de-compress functions
Cedric Nugteren
2017-09-11
Database now works with new format of clblast_[property]
Cedric Nugteren
2017-09-06
Split the database files over multiple directories and files; first step towa...
Cedric Nugteren
2017-07-02
Added interface and stubs for the im2col routine
Cedric Nugteren
2017-06-25
Fixed some Clang and MSVC warnings
Cedric Nugteren
2017-06-21
Fixes some compilation issues related to the database structure change
Cedric Nugteren
2017-06-20
Changed the structure of the database to reduce compilation time and save memory
Cedric Nugteren
2017-05-24
changing "wb" to "w" when saving json file (text mode) - compatibility for Py...
Grigori Fursin
2017-05-12
Added the IxAMIN routines: absolute minimum version of IxAMAX
Cedric Nugteren
2017-05-11
Minor naming fixes to the benchmark script
Cedric Nugteren
2017-04-23
Added an option to the database script to remove tuning results from the data...
Cedric Nugteren
2017-04-23
Re-added Titan X (Pascal) tuning results based on more averaging when tuning
Cedric Nugteren
2017-04-21
Merge branch 'development' into benchmarking
Cedric Nugteren
2017-04-21
Removed the words SUMMARY from the title of the benchmark script when benchma...
Cedric Nugteren
2017-04-20
Updated the settings for the batched benchmarks
Cedric Nugteren
2017-04-17
Fixed a namespace clash with CUDA FP16 for the half-datatype
Cedric Nugteren
2017-04-17
Added proper handling of mismatched arguments in the database script
Cedric Nugteren
2017-04-16
Set proper settings for the benchmarks of batched routines
Cedric Nugteren
2017-04-16
Merge branch 'development' into benchmarking
Cedric Nugteren
2017-04-16
Added settings for benchmarking batched routines
Cedric Nugteren
2017-04-14
Added a benchmark-all script to run multiple benchmarks automatically
Cedric Nugteren
2017-04-14
Tuned the num-runs settings for the benchmarks
Cedric Nugteren
2017-04-14
Added output-folder for benchmarking and removed the requirement on X
Cedric Nugteren
2017-04-14
Made the number of runs a benchmark-specific setting in the benchmark scripts
Cedric Nugteren
2017-04-13
Fixed CUDA malloc and cuBLAS handles: cuBLAS as a performance-reference now w...
Cedric Nugteren
2017-04-11
Made compilation of the cuBLAS wrapper work properly
Cedric Nugteren
2017-04-10
Merge branch 'development' into cublas_reference
Cedric Nugteren
2017-04-10
Removed const-vector-of-const-objects from the database class to remain accor...
Cedric Nugteren
2017-04-06
Completed the cuBLAS wrapper
Cedric Nugteren
2017-04-05
Added a first version of a cuBLAS wrapper (WIP)
Cedric Nugteren
2017-04-03
In-lined the float2 and double2 types to avoid collision with CUDA's definitions
Cedric Nugteren
2017-04-02
Various tweaks to the new benchmark script
Cedric Nugteren
2017-04-01
Tuned the plots for a tight-layout for in papers and presentations
Cedric Nugteren
2017-03-26
Replaced the R graph scripts with Python/Matplotlib benchmark scripts
Cedric Nugteren
2017-03-10
Added API and test infrastructure for the batched GEMM routine
Cedric Nugteren
2017-03-08
Make batched routines based on offsets instead of a vector of cl_mem objects ...
Cedric Nugteren
2017-03-05
Added first naive version of the batched AXPY routine
Cedric Nugteren