debian-clblast
Debian package for CLBlast.
gspr@nonempty.org
path: root/scripts
Age | Commit message | Author
2018-01-25 | Improved the benchmark scripts; added gemmstridedbatched benchmark | Cedric Nugteren
2018-01-14 | Small improvements to benchmarking for cuBLAS | Cedric Nugteren
2018-01-11 | Added a RetrieveParameters function to inspect tuning parameters | Cedric Nugteren
2018-01-07 | Added API and tests for new GemmStridedBatched routine | Cedric Nugteren
2018-01-06 | Fixed a minor nullptr related issue in the code generator | Cedric Nugteren
2018-01-06 | Merge pull request #238 from CNugteren/gemm_api_with_temp_buffer | Cedric Nugteren
2018-01-06 | Added CUDA interface to get temporary-buffer size for GEMM routine | Cedric Nugteren
2018-01-04 | Added a CUDA version of the GEMM temp-buffer optional argument | Cedric Nugteren
2018-01-04 | Updated the generator script to automatically generate the temp-buffer code | Cedric Nugteren
2017-12-31 | Made plotting script more flexible: extra argument to set the comparison library | Cedric Nugteren
2017-12-28 | Added interface to compute the required temporary buffer size for GEMM | Cedric Nugteren
2017-12-27 | Split the database into multiple small compilation units | Cedric Nugteren
2017-12-20 | Made plotting script more resilient to missing data | Cedric Nugteren
2017-12-20 | Added tuning results for Apple AMD Radeon Pro 580 | Cedric Nugteren
2017-12-20 | Added try-except to database script parser to skip invalid files | Cedric Nugteren
2017-11-20 | Made the database script properly handle multiple entries for a single device | Cedric Nugteren
2017-11-19 | Minor fix to the database script | Cedric Nugteren
2017-11-19 | Some fixed for the new auto-tuner to be compatible with the Python scripts | Cedric Nugteren
2017-11-06 | Improved the way the database defaults are computed | Cedric Nugteren
2017-11-02 | Integrated the GEMM routine tuner for kernel selection; added first tuning re... | Cedric Nugteren
2017-11-02 | Fixed a bug in database compression/decompression | Cedric Nugteren
2017-10-14 | Various fixes to make the host code and sample compile with the CUDA API | Cedric Nugteren
2017-10-12 | CUDA API now takes context and device in instead of stream | Cedric Nugteren
2017-10-11 | Added first (untested) version of a CUDA API | Cedric Nugteren
2017-10-09 | Fixed the Python generator script w.r.t. the recent change of testing direct/... | Cedric Nugteren
2017-10-08 | Moved non-routine-specific API functions and includes to separate files | Cedric Nugteren
2017-09-16 | Improved compilation time of the tuner database | Cedric Nugteren
2017-09-14 | Added architecture layer in the tuning database for better performance on uns... | Cedric Nugteren
2017-09-12 | Added database compress and de-compress functions | Cedric Nugteren
2017-09-11 | Database now works with new format of clblast_[property] | Cedric Nugteren
2017-09-06 | Split the database files over multiple directories and files; first step towa... | Cedric Nugteren
2017-07-02 | Added interface and stubs for the im2col routine | Cedric Nugteren
2017-06-25 | Fixed some Clang and MSVC warnings | Cedric Nugteren
2017-06-21 | Fixes some compilation issues related to the database structure change | Cedric Nugteren
2017-06-20 | Changed the structure of the database to reduce compilation time and save memory | Cedric Nugteren
2017-05-24 | changing "wb" to "w" when saving json file (text mode) - compatibility for Py... | Grigori Fursin
2017-05-12 | Added the IxAMIN routines: absolute minimum version of IxAMAX | Cedric Nugteren
2017-05-11 | Minor naming fixes to the benchmark script | Cedric Nugteren
2017-04-23 | Added an option to the database script to remove tuning results from the data... | Cedric Nugteren
2017-04-23 | Re-added Titan X (Pascal) tuning results based on more averaging when tuning | Cedric Nugteren
2017-04-21 | Merge branch 'development' into benchmarking | Cedric Nugteren
2017-04-21 | Removed the words SUMMARY from the title of the benchmark script when benchma... | Cedric Nugteren
2017-04-20 | Updated the settings for the batched benchmarks | Cedric Nugteren
2017-04-17 | Fixed a namespace clash with CUDA FP16 for the half-datatype | Cedric Nugteren
2017-04-17 | Added proper handling of mismatched arguments in the database script | Cedric Nugteren
2017-04-16 | Set proper settings for the benchmarks of batched routines | Cedric Nugteren
2017-04-16 | Merge branch 'development' into benchmarking | Cedric Nugteren
2017-04-16 | Added settings for benchmarking batched routines | Cedric Nugteren
2017-04-14 | Added a benchmark-all script to run multiple benchmarks automatically | Cedric Nugteren
2017-04-14 | Tuned the num-runs settings for the benchmarks | Cedric Nugteren