debian-clblast
Debian package for CLBlast.
gspr@nonempty.org
path: root/src
Age
Commit message
Author
2017-03-11
Added initial naive version of the batched GEMM routine based on the direct G...
Cedric Nugteren
2017-03-10
Added API and test infrastructure for the batched GEMM routine
Cedric Nugteren
2017-03-10
Added proper testing of the alpha parameter; finalized the batched AXPY imple...
Cedric Nugteren
2017-03-10
Fixed a small compilation bug for MSVC related to a floating-point constant
Cedric Nugteren
2017-03-08
Implemented a batched version of the AXPY kernel
Cedric Nugteren
2017-03-08
Make batched routines based on offsets instead of a vector of cl_mem objects ...
Cedric Nugteren
2017-03-05
Minor fixes to the client w.r.t. the addition of the batch count
Cedric Nugteren
2017-03-05
Added first naive version of the batched AXPY routine
Cedric Nugteren
2017-03-05
Adjusted the test-infrastructure to support testing of batched-versions of ro...
Cedric Nugteren
2017-03-05
Changed the way the test-data is generated: now using a single MT generator a...
Cedric Nugteren
2017-03-05
Prepared generator for batched routines; added batched AXPY routine interface
Cedric Nugteren
2017-03-04
Added tuning results for the Radeon HD6750M GPU (Apple OpenCL)
Cedric Nugteren
2017-03-04
Added a proper data-preparation function for the TRSM tests
Cedric Nugteren
2017-03-01
Added proper support for the b_offset argument in TRSM
Cedric Nugteren
2017-02-27
Fixed half-precision bugs in HTBMV/HTPMV/HTRMV/HSYR2K/HTRMM related to incorr...
Cedric Nugteren
2017-02-26
Split the GEMM kernel further up to prevent C1091 in MSVC
Cedric Nugteren
2017-02-26
Merge branch 'development' into triangular_solvers
Cedric Nugteren
2017-02-26
Fixed an out-of-bounds memory access when filling a matrix with a constant
Cedric Nugteren
2017-02-26
Removed half-precision support from the TRSM routine; too unstable
Cedric Nugteren
2017-02-26
Fixes division in the kernel for inversion of complex numbers
Cedric Nugteren
2017-02-25
Added PrepareData function for TRSM to create proper test input
Cedric Nugteren
2017-02-24
Implemented a simple row-major to col-major problem conversion for TRSM
Cedric Nugteren
2017-02-22
Fixed a few issues with the TRSM routine; some tests still failing
Cedric Nugteren
2017-02-19
Added data-preparation function for the TRSV tests and special nan/inf checks...
Cedric Nugteren
2017-02-18
Added tuning parameters for the AMD RX480 GPU (Ellesmere)
Cedric Nugteren
2017-02-18
Fixed the naming of the C API of OverrideParameters and fixed the description
Cedric Nugteren
2017-02-16
Added a C interface to the OverrideParameters function; added some in-line co...
Cedric Nugteren
2017-02-16
Added input-sanity checks for the OverrideParameters function
Cedric Nugteren
2017-02-13
Added first version of the OverrideParameters function
Cedric Nugteren
2017-02-13
Fixed a small bug in GEMV: unused kernel in parameter list
Cedric Nugteren
2017-02-12
Split the database into several smaller cached per-kernel databases (in prepa...
Cedric Nugteren
2017-02-12
Made RemoveBySubset from the cache work with references to keys
Cedric Nugteren
2017-02-11
Added an option to remove items from the caches, optionally by a subset of 2 ...
Cedric Nugteren
2017-02-08
Added tuning results for Titan X (Pascal version)
Cedric Nugteren
2017-02-05
Merge branch 'development' into triangular_solvers
Cedric Nugteren
2017-02-05
Fixed complex version of the TRSV kernel
Cedric Nugteren
2017-02-04
Improved substitution kernels a bit; added complex support
Cedric Nugteren
2017-02-04
Completed a first STRSV implementation
Cedric Nugteren
2017-02-04
Added row-major support for TRSV
Cedric Nugteren
2017-01-29
Added first (incomplete) version of TRSV routine
Cedric Nugteren
2017-01-24
Database: pass Device instead of Queue for clarity
Ivan Shapovalov
2017-01-24
Routine: cache the database instance as well
Ivan Shapovalov
2017-01-24
Database: ref-count the internal map for caching
Ivan Shapovalov
2017-01-24
Routine, Cache: generalize, reduce amount of copying in fast path
Ivan Shapovalov
2017-01-24
FillCache: perform compilation for each precision separately
Ivan Shapovalov
2017-01-24
Routine: fix semi-warm routine construction (when binary is in cache)
Ivan Shapovalov
2017-01-24
src/clpp11.hpp: check pointers before clRelease*()
Ivan Shapovalov
2017-01-24
src/clpp11.hpp: do not store program source/binary in Program
Ivan Shapovalov
2017-01-20
treewide: include clpp11.hpp first to silence deprecation warnings
Ivan Shapovalov
2017-01-20
Routine: use PrecisionSupported<>() instead of duplicating the check
Ivan Shapovalov