debian-clblast
Debian package for CLBlast.
gspr@nonempty.org
path: root/src
Age
Commit message
Author
2016-05-22
Fixed tuning results for half-precision; added first results for the xGER ker...
Cedric Nugteren
2016-05-22
Prepared the GER kernels and tuner for half-precision support
Cedric Nugteren
2016-05-22
Added level-2 half-precision routines HGEMV/HGBMV/HHEMV/HHBMV/HHPMV/HSYMV/HSB...
Cedric Nugteren
2016-05-22
Added first tuning results for the half-precision xGEMV kernels
Cedric Nugteren
2016-05-22
Prepared the GEMV kernels and tuner for half-precision support
Cedric Nugteren
2016-05-22
Added level-1 half-precision routines HSWAP/HSCAL/HCOPY/HAXPY/HDOT/HNRM2/HASU...
Cedric Nugteren
2016-05-22
Added first tuning results for the half-precision xDOT kernels
Cedric Nugteren
2016-05-22
Added half-precision support for all level 1 routines
Cedric Nugteren
2016-05-18
Merged in latest changes from 0.7.1 release
Cedric Nugteren
2016-05-16
Added half precision tuning results for supporting kernels (pad, copy, transp...
Cedric Nugteren
2016-05-16
Prepared GEMM and supporting kernels and tuners for half-precision support
Cedric Nugteren
2016-05-15
Added header with conversions from and to half-precision floating-point
Cedric Nugteren
2016-05-14
Set kernel arguments for AXPY as constant memory buffers, making it possible ...
Cedric Nugteren
2016-05-13
Initial experimental version of the half-precision HAXPY routine
Cedric Nugteren
2016-05-12
Initial changes in preparation for half-precision fp16 support
Cedric Nugteren
2016-05-08
Fixed errors in xAXPY and xSCAL tests on AMD hardware
cnugteren
2016-05-02
Fixed the calculation of the required buffer sizes in case of subvectors and ...
Cedric Nugteren
2016-05-01
Made the default xDOT tuning size smaller
Cedric Nugteren
2016-05-01
Changed the index buffer of IxAMAX routines to unsigned int for proper buffer...
Cedric Nugteren
2016-05-01
Added a program cache (per-context) next to the per-device binary cache
Cedric Nugteren
2016-04-30
Added non-absolute minimum counter-part IxMIN of the BLAS routine IxAMAX
Cedric Nugteren
2016-04-29
Added FillCache: a function to pre-compile all kernels for a specific device
Cedric Nugteren
2016-04-28
Fixed the cache to store binaries instead of OpenCL programs
Cedric Nugteren
2016-04-27
Added non-absolute counter-parts xSUM and IxMAX of the BLAS routines xASUM an...
Cedric Nugteren
2016-04-27
Added prototypes for non-BLAS routines: xSUM and IxMAX (non-absolute counterp...
Cedric Nugteren
2016-04-27
Moved all cache-related functions to a separate file; added a ClearCompiledPr...
Cedric Nugteren
2016-04-20
Added support for the iSAMAX/iDAMAX/iCAMAX/iZAMAX routines
cnugteren
2016-04-20
Added prototype for ixAMAX routines
cnugteren
2016-04-14
Updated the reduction-kernel tuner to also tune the epilogue
cnugteren
2016-04-14
Added support for the SASUM/DASUM/ScASUM/DzASUM routines
cnugteren
2016-04-13
Added prototype for xASUM routines
cnugteren
2016-04-09
Events are now properly implemented using event waiting list and asking the u...
cnugteren
2016-04-04
Removed redundant queue synchronisation statements
cnugteren
2016-04-01
Added a wrapper for CBLAS libraries for performance/correctness testing
cnugteren
2016-03-30
Merge branch 'level1_routines' into development
cnugteren
2016-03-30
Fixed the nrm2 kernel for complex data-types
cnugteren
2016-03-30
Added prototypes for the xROTM and xROTMG routines
Cedric Nugteren
2016-03-30
Added prototypes for the xROT and xROTG functions
Cedric Nugteren
2016-03-30
Fixed properly passing of OpenCL events to CLBlast functions
Cedric Nugteren
2016-03-28
Added preliminary support for the xNRM2 routines
Cedric Nugteren
2016-03-25
Added prototypes for ScNRM2/DzNRM2 routines
Cedric Nugteren
2016-03-25
Added prototypes for SNRM2/DNRM2 routines
Cedric Nugteren
2016-03-23
Fixed the C-api export to be able to properly build a DLL on Windows
Cedric Nugteren
2016-03-19
Added __declspec(dllexport) to create a DLL on Windows
Cedric Nugteren
2016-03-14
Made the library thread-safe by guarding the kernel cache with a mutex
Cedric Nugteren
2016-03-06
Fixed a bug in the GER-family of routines due to incorrect division of the wo...
Cedric Nugteren
2016-03-06
Added preliminary support for xHPR2 and xSPR2 routines
Cedric Nugteren
2016-03-02
Added preliminary support for xHER2 and xSYR2 routines
Cedric Nugteren
2016-02-28
Fixed a couple of correctness bugs in the Xher kernels
Cedric Nugteren
2016-02-28
Added support for xHER, xHPR, xSYR, and xSPR routines
Cedric Nugteren