debian-clblast
Debian package for CLBlast.
gspr@nonempty.org
path: root/include
Age
Commit message
Author
2016-06-18
Clean-up of the routine class, moved RunKernel to the routine/common file
Cedric Nugteren
2016-06-18
Removed the template from the Routine base-class
Cedric Nugteren
2016-06-17
Removed the precision argument from the routines in favor of a single templat...
Cedric Nugteren
2016-06-17
Removed the interface to the cache functions from the Routine class, calls th...
Cedric Nugteren
2016-06-17
Moved the RunKernel and PadCopyTransposeMatrix functions out of the Routine c...
Cedric Nugteren
2016-06-17
Moved the ErrorIn function from the Routine class to the utilities header
Cedric Nugteren
2016-06-17
Moved the test-for-valid-buffers function from the Routine class to separate ...
Cedric Nugteren
2016-06-16
Added XOMATCOPY routines to perform out-of-place matrix scaling, copying, and...
Cedric Nugteren
2016-06-15
Added some constness to variables related to the GEMM routines
Cedric Nugteren
2016-06-14
Moved device vendor and type checks to a common header
Cedric Nugteren
2016-06-08
Added global memory synchronisation for better cache performance on ARM Mali ...
Cedric Nugteren
2016-06-01
Added tuning parameters for 'GRID K520' and 'HD Graphics Skylake ULT GT2'
Cedric Nugteren
2016-05-26
Added half-precision tests for the clBLAS reference through conversion to sin...
Cedric Nugteren
2016-05-25
Added level-3 half-precision routines HGEMM/HSYMM/HSYRK/HSYR2K/HTRMM
Cedric Nugteren
2016-05-22
Added level-2 half-precision routines HGER/HSYR/HSPR/HSYR2/HSPR2
Cedric Nugteren
2016-05-22
Fixed tuning results for half-precision; added first results for the xGER ker...
Cedric Nugteren
2016-05-22
Prepared the GER kernels and tuner for half-precision support
Cedric Nugteren
2016-05-22
Added level-2 half-precision routines HGEMV/HGBMV/HHEMV/HHBMV/HHPMV/HSYMV/HSB...
Cedric Nugteren
2016-05-22
Added first tuning results for the half-precision xGEMV kernels
Cedric Nugteren
2016-05-22
Prepared the GEMV kernels and tuner for half-precision support
Cedric Nugteren
2016-05-22
Added level-1 half-precision routines HSWAP/HSCAL/HCOPY/HAXPY/HDOT/HNRM2/HASU...
Cedric Nugteren
2016-05-22
Added first tuning results for the half-precision xDOT kernels
Cedric Nugteren
2016-05-18
Merged in latest changes from 0.7.1 release
Cedric Nugteren
2016-05-16
Added half precision tuning results for supporting kernels (pad, copy, transp...
Cedric Nugteren
2016-05-15
Added header with conversions from and to half-precision floating-point
Cedric Nugteren
2016-05-14
Set kernel arguments for AXPY as constant memory buffers, making it possible ...
Cedric Nugteren
2016-05-13
Initial experimental version of the half-precision HAXPY routine
Cedric Nugteren
2016-05-12
Initial changes in preparation for half-precision fp16 support
Cedric Nugteren
2016-05-02
Added tuning results for AMD Hawaii (R9 290X)
Cedric Nugteren
2016-05-01
Added tuning results for AMD Pitcairn (R9 270X)
Cedric Nugteren
2016-05-01
Updated tuning database for reduction/dot kernels based on the new tuner; par...
Cedric Nugteren
2016-05-01
Changed the index buffer of IxAMAX routines to unsigned int for proper buffer...
Cedric Nugteren
2016-05-01
Added a program cache (per-context) next to the per-device binary cache
Cedric Nugteren
2016-04-30
Added non-absolute minimum counter-part IxMIN of the BLAS routine IxAMAX
Cedric Nugteren
2016-04-29
Added FillCache: a function to pre-compile all kernels for a specific device
Cedric Nugteren
2016-04-28
Fixed the cache to store binaries instead of OpenCL programs
Cedric Nugteren
2016-04-27
Added non-absolute counter-parts xSUM and IxMAX of the BLAS routines xASUM an...
Cedric Nugteren
2016-04-27
Added prototypes for non-BLAS routines: xSUM and IxMAX (non-absolute counterp...
Cedric Nugteren
2016-04-27
Moved all cache-related functions to a separate file; added a ClearCompiledPr...
Cedric Nugteren
2016-04-27
Added a '-verbose' option to the test binaries to report errors in more detai...
Cedric Nugteren
2016-04-27
All CLBlast enum constants now have the same raw values as in the cblas standard
Cedric Nugteren
2016-04-20
Added support for the iSAMAX/iDAMAX/iCAMAX/iZAMAX routines
cnugteren
2016-04-20
Added prototype for ixAMAX routines
cnugteren
2016-04-14
Added support for the SASUM/DASUM/ScASUM/DzASUM routines
cnugteren
2016-04-13
Added prototype for xASUM routines
cnugteren
2016-04-11
Fixed the way the defaults are calculated in the database; added warning for ...
cnugteren
2016-04-09
Events are now properly implemented using event waiting list and asking the u...
cnugteren
2016-04-02
Added support for testing (performance and correctness) against a CPU BLAS li...
cnugteren
2016-04-01
Added a wrapper for CBLAS libraries for performance/correctness testing
cnugteren
2016-03-30
Merge branch 'level1_routines' into development
cnugteren