path: root/scripts
Age | Commit message | Author
2018-01-31 | Created the API and stubs for the HAD (hadamard-product) routines | Cedric Nugteren
2018-01-27 | Some fixes to the benchmark scripts | Cedric Nugteren
2018-01-26 | Minor displaying improvements to the graph plotting scripts | Cedric Nugteren
2018-01-25 | Improved the benchmark scripts; added gemmstridedbatched benchmark | Cedric Nugteren
2018-01-14 | Small improvements to benchmarking for cuBLAS | Cedric Nugteren
2018-01-11 | Added a RetrieveParameters function to inspect tuning parameters | Cedric Nugteren
2018-01-07 | Added API and tests for new GemmStridedBatched routine | Cedric Nugteren
2018-01-06 | Fixed a minor nullptr related issue in the code generator | Cedric Nugteren
2018-01-06 | Merge pull request #238 from CNugteren/gemm_api_with_temp_buffer | Cedric Nugteren
2018-01-06 | Added CUDA interface to get temporary-buffer size for GEMM routine | Cedric Nugteren
2018-01-04 | Added a CUDA version of the GEMM temp-buffer optional argument | Cedric Nugteren
2018-01-04 | Updated the generator script to automatically generate the temp-buffer code | Cedric Nugteren
2017-12-31 | Made plotting script more flexible: extra argument to set the comparison library | Cedric Nugteren
2017-12-28 | Added interface to compute the required temporary buffer size for GEMM | Cedric Nugteren
2017-12-27 | Split the database into multiple small compilation units | Cedric Nugteren
2017-12-20 | Made plotting script more resilient to missing data | Cedric Nugteren
2017-12-20 | Added tuning results for Apple AMD Radeon Pro 580 | Cedric Nugteren
2017-12-20 | Added try-except to database script parser to skip invalid files | Cedric Nugteren
2017-11-20 | Made the database script properly handle multiple entries for a single device | Cedric Nugteren
2017-11-19 | Minor fix to the database script | Cedric Nugteren
2017-11-19 | Some fixed for the new auto-tuner to be compatible with the Python scripts | Cedric Nugteren
2017-11-06 | Improved the way the database defaults are computed | Cedric Nugteren
2017-11-02 | Integrated the GEMM routine tuner for kernel selection; added first tuning re... | Cedric Nugteren
2017-11-02 | Fixed a bug in database compression/decompression | Cedric Nugteren
2017-10-14 | Various fixes to make the host code and sample compile with the CUDA API | Cedric Nugteren
2017-10-12 | CUDA API now takes context and device in instead of stream | Cedric Nugteren
2017-10-11 | Added first (untested) version of a CUDA API | Cedric Nugteren
2017-10-09 | Fixed the Python generator script w.r.t. the recent change of testing direct/... | Cedric Nugteren
2017-10-08 | Moved non-routine-specific API functions and includes to separate files | Cedric Nugteren
2017-09-16 | Improved compilation time of the tuner database | Cedric Nugteren
2017-09-14 | Added architecture layer in the tuning database for better performance on uns... | Cedric Nugteren
2017-09-12 | Added database compress and de-compress functions | Cedric Nugteren
2017-09-11 | Database now works with new format of clblast_[property] | Cedric Nugteren
2017-09-06 | Split the database files over multiple directories and files; first step towa... | Cedric Nugteren
2017-07-02 | Added interface and stubs for the im2col routine | Cedric Nugteren
2017-06-25 | Fixed some Clang and MSVC warnings | Cedric Nugteren
2017-06-21 | Fixes some compilation issues related to the database structure change | Cedric Nugteren
2017-06-20 | Changed the structure of the database to reduce compilation time and save memory | Cedric Nugteren
2017-05-24 | changing "wb" to "w" when saving json file (text mode) - compatibility for Py... | Grigori Fursin
2017-05-12 | Added the IxAMIN routines: absolute minimum version of IxAMAX | Cedric Nugteren
2017-05-11 | Minor naming fixes to the benchmark script | Cedric Nugteren
2017-04-23 | Added an option to the database script to remove tuning results from the data... | Cedric Nugteren
2017-04-23 | Re-added Titan X (Pascal) tuning results based on more averaging when tuning | Cedric Nugteren
2017-04-21 | Merge branch 'development' into benchmarking | Cedric Nugteren
2017-04-21 | Removed the words SUMMARY from the title of the benchmark script when benchma... | Cedric Nugteren
2017-04-20 | Updated the settings for the batched benchmarks | Cedric Nugteren
2017-04-17 | Fixed a namespace clash with CUDA FP16 for the half-datatype | Cedric Nugteren
2017-04-17 | Added proper handling of mismatched arguments in the database script | Cedric Nugteren
2017-04-16 | Set proper settings for the benchmarks of batched routines | Cedric Nugteren
2017-04-16 | Merge branch 'development' into benchmarking | Cedric Nugteren