debian-clblast
Debian package for CLBlast.
gspr@nonempty.org
path: root/src
Age
Commit message
Author
2017-02-26
Removed half-precision support from the TRSM routine; too unstable
Cedric Nugteren
2017-02-26
Fixes division in the kernel for inversion of complex numbers
Cedric Nugteren
2017-02-25
Added PrepareData function for TRSM to create proper test input
Cedric Nugteren
2017-02-24
Implemented a simple row-major to col-major problem conversion for TRSM
Cedric Nugteren
2017-02-22
Fixed a few issues with the TRSM routine; some tests still failing
Cedric Nugteren
2017-02-19
Added data-preparation function for the TRSV tests and special nan/inf checks...
Cedric Nugteren
2017-02-05
Merge branch 'development' into triangular_solvers
Cedric Nugteren
2017-02-05
Fixed complex version of the TRSV kernel
Cedric Nugteren
2017-02-04
Improved substitution kernels a bit; added complex support
Cedric Nugteren
2017-02-04
Completed a first STRSV implementation
Cedric Nugteren
2017-02-04
Added row-major support for TRSV
Cedric Nugteren
2017-01-29
Added first (incomplete) version of TRSV routine
Cedric Nugteren
2017-01-24
Database: pass Device instead of Queue for clarity
Ivan Shapovalov
2017-01-24
Routine: cache the database instance as well
Ivan Shapovalov
2017-01-24
Database: ref-count the internal map for caching
Ivan Shapovalov
2017-01-24
Routine, Cache: generalize, reduce amount of copying in fast path
Ivan Shapovalov
2017-01-24
FillCache: perform compilation for each precision separately
Ivan Shapovalov
2017-01-24
Routine: fix semi-warm routine construction (when binary is in cache)
Ivan Shapovalov
2017-01-24
src/clpp11.hpp: check pointers before clRelease*()
Ivan Shapovalov
2017-01-24
src/clpp11.hpp: do not store program source/binary in Program
Ivan Shapovalov
2017-01-20
treewide: include clpp11.hpp first to silence deprecation warnings
Ivan Shapovalov
2017-01-20
Routine: use PrecisionSupported<>() instead of duplicating the check
Ivan Shapovalov
2017-01-20
Added prototype for the TRSV routine
Cedric Nugteren
2017-01-20
Set number of decimals for floating-point printing for error reporting
Cedric Nugteren
2017-01-19
Added tuning results for NVIDIA GTX 1080 and Intel Core i7-4790K
Cedric Nugteren
2017-01-18
Added first version of the TRSM routine based on the diagonal invert kernel
Cedric Nugteren
2017-01-15
Added a first version of the diagonal block invert routine in preparation of ...
Cedric Nugteren
2017-01-15
Prints additional information in verbose/debug mode
Cedric Nugteren
2017-01-07
Always enables cl_khr_fp64 when running double-precision, not just for OpenCL...
Cedric Nugteren
2017-01-03
Added tuning results for the AMD Turks GPU and the Intel Core i7-2670QM CPU
Cedric Nugteren
2016-12-18
Prepared for the addition of the TRSM triangular solver kernel
Cedric Nugteren
2016-12-18
Fixed a bug when using offsets in the direct GEMM kernels
Cedric Nugteren
2016-11-29
Made Intel GPUs always use the indirect version of the GEMM kernel
Cedric Nugteren
2016-11-27
Made it possible to use the command-line environmental variables for each exe...
Cedric Nugteren
2016-11-26
Improved the default parameters for cases with non-common parameters across a...
Cedric Nugteren
2016-11-24
Merge pull request #125 from CNugteren/netlib_blas_api
Cedric Nugteren
2016-11-23
Fixed a vector-size related bug in the CLBlast Netlib API
Cedric Nugteren
2016-11-23
Fixed a bug in the HSCAL routine
Cedric Nugteren
2016-11-22
Minor changes to ensure full compatibility with the Netlib CBLAS API
Cedric Nugteren
2016-11-20
Made functions with scalar-buffers as output properly return values
Cedric Nugteren
2016-11-20
Now correctly tests for validity of the B matrix in the TRMM routine
Cedric Nugteren
2016-11-20
Forced OpenCL 1.1 compilation and disabled a deprecation warning
Cedric Nugteren
2016-11-20
Fixed a bug in the TRMM routine caused by overwriting input data before consu...
Cedric Nugteren
2016-11-19
Changed the GEMM kernel selection parameters for Skylake GPUs to always favou...
Cedric Nugteren
2016-11-15
Updated the tuning results for the Intel Skylake ULT GT2 GPU
Cedric Nugteren
2016-10-25
Renamed the include and source files of the Netlib CBLAS API
Cedric Nugteren
2016-10-25
Removed the clblast namespace from the Netlib C API source file to ensure pro...
Cedric Nugteren
2016-10-25
Fixed some issues preventing the Netlib CBLAS API from linking correctly
Cedric Nugteren
2016-10-25
Made the Netlib CBLAS API use the same enums with prefixes as the regular C A...
Cedric Nugteren
2016-10-25
Sets the proper sizes for the buffers for the Netlib CBLAS API
Cedric Nugteren