debian-clblast
Debian package for CLBlast.
gspr@nonempty.org
Age
Commit message
Author
2017-05-11
Added tuning results for the AMD Radeon Fiji GPU
Cedric Nugteren
2017-05-11
Fixes the build-status table in the README
Cedric Nugteren
2017-05-11
Bug-fix in the half-precision test of the amax routine
Cedric Nugteren
2017-05-11
Re-added random tuning for GEMM after accidental removal
Cedric Nugteren
2017-05-11
Minor naming fixes to the benchmark script
Cedric Nugteren
2017-05-11
Merge branch 'master_is_neww_devel_branch'
Cedric Nugteren
2017-05-03
The master branch is now the main 'development' branch
Cedric Nugteren
2017-05-02
Merge pull request #150 from CNugteren/development
Cedric Nugteren
2017-05-02
Updated to version 0.11.0
Cedric Nugteren
2017-04-23
Merge pull request #148 from CNugteren/benchmarking
Cedric Nugteren
2017-04-23
Added an option to the database script to remove tuning results from the data...
Cedric Nugteren
2017-04-23
Re-added Titan X (Pascal) tuning results based on more averaging when tuning
Cedric Nugteren
2017-04-23
Fixed a compiler warning message
Cedric Nugteren
2017-04-22
Increased the default number of runs for the tuner from 2 up to 10 for fast k...
Cedric Nugteren
2017-04-22
Fixed the direct vs indirect setting for NVIDIA GPUs
Cedric Nugteren
2017-04-21
Increased the default number of runs for GEMV tuning; updated GEMV tuning res...
Cedric Nugteren
2017-04-21
Merge branch 'development' into benchmarking
Cedric Nugteren
2017-04-21
Removed the words SUMMARY from the title of the benchmark script when benchma...
Cedric Nugteren
2017-04-20
Updated the settings for the batched benchmarks
Cedric Nugteren
2017-04-20
Tuned the direct versus indirect GEMM kernel trade-off point for NVIDIA GPUs
Cedric Nugteren
2017-04-17
Fixed a namespace clash with CUDA FP16 for the half-datatype
Cedric Nugteren
2017-04-17
Added proper handling of mismatched arguments in the database script
Cedric Nugteren
2017-04-16
Set proper settings for the benchmarks of batched routines
Cedric Nugteren
2017-04-16
Merge branch 'development' into benchmarking
Cedric Nugteren
2017-04-16
Merge pull request #147 from CNugteren/cublas_reference
Cedric Nugteren
2017-04-16
Finalized support for performance testing against cuBLAS
Cedric Nugteren
2017-04-16
Added settings for benchmarking batched routines
Cedric Nugteren
2017-04-14
Added a benchmark-all script to run multiple benchmarks automatically
Cedric Nugteren
2017-04-14
Tuned the num-runs settings for the benchmarks
Cedric Nugteren
2017-04-14
Added output-folder for benchmarking and removed the requirement on X
Cedric Nugteren
2017-04-14
Made the number of runs a benchmark-specific setting in the benchmark scripts
Cedric Nugteren
2017-04-14
Added a new Xaxpy kernel in between the regular and fast version in
Cedric Nugteren
2017-04-13
Fixed CUDA malloc and cuBLAS handles: cuBLAS as a performance-reference now w...
Cedric Nugteren
2017-04-11
Made compilation of the cuBLAS wrapper work properly
Cedric Nugteren
2017-04-10
Added reference implementations for performance-testing against cuBLAS
Cedric Nugteren
2017-04-10
Merge branch 'development' into cublas_reference
Cedric Nugteren
2017-04-10
Merge pull request #145 from CNugteren/apple_cpu_support
Cedric Nugteren
2017-04-10
Fixed a compilation issue under MSVC and GCC
Cedric Nugteren
2017-04-10
Removed const-vector-of-const-objects from the database class to remain accor...
Cedric Nugteren
2017-04-10
Updated the changelog with the Apple CPU override
Cedric Nugteren
2017-04-07
Added a special override database for the Apple CPU implementation on OS X: t...
Cedric Nugteren
2017-04-07
Uses float2 and double2 for base complex data-types instead of a custom struc...
Cedric Nugteren
2017-04-07
Added some missing const-ness
Cedric Nugteren
2017-04-06
Completed the cuBLAS wrapper
Cedric Nugteren
2017-04-06
Fixed some size_t to int conversion warnings for the CBLAS interface
Cedric Nugteren
2017-04-05
Added a first version of a cuBLAS wrapper (WIP)
Cedric Nugteren
2017-04-03
Fixes the CUDA wrapper (now actually tested on a system with CUDA)
Cedric Nugteren
2017-04-03
Added proper CMake searching for CUDA and cuBLAS
Cedric Nugteren
2017-04-03
In-lined the float2 and double2 types to avoid collision with CUDA's definitions
Cedric Nugteren
2017-04-02
Layed the groundwork for cuBLAS comparisons in the clients
Cedric Nugteren