debian-clblast
Debian package for CLBlast.
gspr@nonempty.org
Age
Commit message
Author
2017-04-14
Tuned the num-runs settings for the benchmarks
Cedric Nugteren
2017-04-14
Added output-folder for benchmarking and removed the requirement on X
Cedric Nugteren
2017-04-14
Made the number of runs a benchmark-specific setting in the benchmark scripts
Cedric Nugteren
2017-04-14
Added a new Xaxpy kernel in between the regular and fast version in
Cedric Nugteren
2017-04-10
Merge pull request #145 from CNugteren/apple_cpu_support
Cedric Nugteren
2017-04-10
Fixed a compilation issue under MSVC and GCC
Cedric Nugteren
2017-04-10
Removed const-vector-of-const-objects from the database class to remain accor...
Cedric Nugteren
2017-04-10
Updated the changelog with the Apple CPU override
Cedric Nugteren
2017-04-07
Added a special override database for the Apple CPU implementation on OS X: t...
Cedric Nugteren
2017-04-07
Uses float2 and double2 for base complex data-types instead of a custom struc...
Cedric Nugteren
2017-04-07
Added some missing const-ness
Cedric Nugteren
2017-04-02
Merge pull request #144 from CNugteren/matplotlib_graphs
Cedric Nugteren
2017-04-02
Merge pull request #143 from CNugteren/test_cblas_timing
Cedric Nugteren
2017-04-02
Various tweaks to the new benchmark script
Cedric Nugteren
2017-04-01
Tuned the plots for a tight-layout for in papers and presentations
Cedric Nugteren
2017-04-01
Separated host-device and device-host memory copies from execution of the CBL...
Cedric Nugteren
2017-03-26
Replaced the R graph scripts with Python/Matplotlib benchmark scripts
Cedric Nugteren
2017-03-20
Fixed a GCC/MSVC compilation issue
Cedric Nugteren
2017-03-19
Merge pull request #142 from CNugteren/gemm_batched
Cedric Nugteren
2017-03-19
Fixed a compilation issue for GCC/MSVC
Cedric Nugteren
2017-03-19
Added an (optional) non-direct implementation of the batched GEMM routine
Cedric Nugteren
2017-03-19
Added batched versions of the pad/copy/transpose kernels
Cedric Nugteren
2017-03-14
Added the possibility to tune batched kernels
Cedric Nugteren
2017-03-12
Fixed a linker issue for Clang
Cedric Nugteren
2017-03-11
Added initial naive version of the batched GEMM routine based on the direct G...
Cedric Nugteren
2017-03-10
Added API and test infrastructure for the batched GEMM routine
Cedric Nugteren
2017-03-10
Merge pull request #141 from CNugteren/axpy_batched
Cedric Nugteren
2017-03-10
Small fix for a file that isn't currently compiled anymore
Cedric Nugteren
2017-03-10
Added proper testing of the alpha parameter; finalized the batched AXPY imple...
Cedric Nugteren
2017-03-10
Fixed a small compilation bug for MSVC related to a floating-point constant
Cedric Nugteren
2017-03-08
Implemented a batched version of the AXPY kernel
Cedric Nugteren
2017-03-08
Make batched routines based on offsets instead of a vector of cl_mem objects ...
Cedric Nugteren
2017-03-05
Minor fixes to the client w.r.t. the addition of the batch count
Cedric Nugteren
2017-03-05
Added first naive version of the batched AXPY routine
Cedric Nugteren
2017-03-05
Adjusted the test-infrastructure to support testing of batched-versions of ro...
Cedric Nugteren
2017-03-05
Changed the way the test-data is generated: now using a single MT generator a...
Cedric Nugteren
2017-03-05
Prepared generator for batched routines; added batched AXPY routine interface
Cedric Nugteren
2017-03-04
Fixed a missing include for the tests
Cedric Nugteren
2017-03-04
Added tuning results for the Radeon HD6750M GPU (Apple OpenCL)
Cedric Nugteren
2017-03-04
Added a proper data-preparation function for the TRSM tests
Cedric Nugteren
2017-03-01
Added proper support for the b_offset argument in TRSM
Cedric Nugteren
2017-03-01
Made a double to float cast explicit for MSVC compatibility (C2397)
Cedric Nugteren
2017-02-27
Added L2 error computation and checking for half-precision tests
Cedric Nugteren
2017-02-27
Fixed half-precision bugs in HTBMV/HTPMV/HTRMV/HSYR2K/HTRMM related to incorr...
Cedric Nugteren
2017-02-26
Updated the README documentation
Cedric Nugteren
2017-02-26
Merge pull request #138 from CNugteren/triangular_solvers
Cedric Nugteren
2017-02-26
Split the GEMM kernel further up to prevent C1091 in MSVC
Cedric Nugteren
2017-02-26
Minor fix to the generator script
Cedric Nugteren
2017-02-26
Merge branch 'development' into triangular_solvers
Cedric Nugteren
2017-02-26
Added a guard against invalid buffer sizes in the prepare-data functions for ...
Cedric Nugteren