path: root/include/clblast.h
Age | Commit message | Author
2020-05-12 | Added CLBLAST_VERSION_MAJOR/MINOR/PATCH defines in headers to store version numbering | Cedric Nugteren
2018-11-12 | Add kernel_mode option to im2col, col2im, and convgemm functions | Koichi Akabe
2018-10-23 | Added groundwork for col2im algorithm plus first non-working version of kernel and test | Cedric Nugteren
2018-07-29 | Removed complex numbers support for CONVGEMM | Cedric Nugteren
2018-05-05 | Added interface of batched convolution as GEMM | Cedric Nugteren
2018-03-10 | Fixed an issue for DLL linking under Windows | Cedric Nugteren
2018-03-10 | Fixed a few things for the new tuning API | Cedric Nugteren
2018-03-10 | Completed the API for all tuneable kernels | Cedric Nugteren
2018-03-09 | Added several more tuner API functions | Cedric Nugteren
2018-03-06 | First version of the tuning API, added interface for copy-kernel, added sample | Cedric Nugteren
2018-01-31 | Created the API and stubs for the HAD (hadamard-product) routines | Cedric Nugteren
2018-01-11 | Added a RetrieveParameters function to inspect tuning parameters | Cedric Nugteren
2018-01-07 | Added API and tests for new GemmStridedBatched routine | Cedric Nugteren
2017-12-30 | Added optional temp-buffer argument to C++ interface of GEMM | Cedric Nugteren
2017-12-28 | Added interface to compute the required temporary buffer size for GEMM | Cedric Nugteren
2017-07-02 | Added interface and stubs for the im2col routine | Cedric Nugteren
2017-05-12 | Added the IxAMIN routines: absolute minimum version of IxAMAX | Cedric Nugteren
2017-04-07 | Added a special override database for the Apple CPU implementation on OS X: this makes the test work, it does not focus on good performance | Cedric Nugteren
2017-03-10 | Added API and test infrastructure for the batched GEMM routine | Cedric Nugteren
2017-03-08 | Make batched routines based on offsets instead of a vector of cl_mem objects - undoing many earlier changes | Cedric Nugteren
2017-03-05 | Added first naive version of the batched AXPY routine | Cedric Nugteren
2017-03-05 | Prepared generator for batched routines; added batched AXPY routine interface | Cedric Nugteren
2017-02-26 | Merge branch 'development' into triangular_solvers | Cedric Nugteren
2017-02-26 | Removed half-precision support from the TRSM routine; too unstable | Cedric Nugteren
2017-02-18 | Fixed the naming of the C API of OverrideParameters and fixed the description | Cedric Nugteren
2017-02-16 | Added a C interface to the OverrideParameters function; added some in-line comments to the API | Cedric Nugteren
2017-02-16 | Added input-sanity checks for the OverrideParameters function | Cedric Nugteren
2017-02-13 | Added first version of the OverrideParameters function | Cedric Nugteren
2016-10-22 | Added extra error codes to reflect the more detailed error reporting of OpenCL functions | Cedric Nugteren
2016-10-22 | treewide: use C++ exceptions properly | Ivan Shapovalov
    Since the codebase is designed around proper C++ idioms such as RAII, it makes sense to only use C++ exceptions internally instead of mixing exceptions and error codes. The exceptions are now caught at top level to preserve compatibility with the existing error code-based API. Note that we deliberately do not catch C++ runtime errors (such as `std::bad_alloc`) nor logic errors (aka failed assertions) because no actual handling can ever happen for such errors. However, in the C interface we do catch _all_ exceptions (...) and convert them into a wild-card error code.
2016-10-15 | Added documentation and minor refactoring for the recent support of static library compilation | Cedric Nugteren
2016-10-14 | Fixes for static lib compilation on Windows | Shehzan Mohammed
2016-06-30 | Added declspec(dllexport) to ClearCache and FillCache, and added declspec(dllimport) when not building the library | Cedric Nugteren
2016-06-17 | Moved the test-for-valid-buffers function from the Routine class to separate functions in a separate file | Cedric Nugteren
2016-06-16 | Added XOMATCOPY routines to perform out-of-place matrix scaling, copying, and/or transposing | Cedric Nugteren
2016-05-25 | Added level-3 half-precision routines HGEMM/HSYMM/HSYRK/HSYR2K/HTRMM | Cedric Nugteren
2016-05-22 | Added level-2 half-precision routines HGER/HSYR/HSPR/HSYR2/HSPR2 | Cedric Nugteren
2016-05-22 | Added level-2 half-precision routines HGEMV/HGBMV/HHEMV/HHBMV/HHPMV/HSYMV/HSBMV/HSPMV/HTRMV/HTBMV/HTPMV | Cedric Nugteren
2016-05-22 | Added level-1 half-precision routines HSWAP/HSCAL/HCOPY/HAXPY/HDOT/HNRM2/HASUM/HSUM/iHAMAX/iHMAX/iHMIN | Cedric Nugteren
2016-05-13 | Initial experimental version of the half-precision HAXPY routine | Cedric Nugteren
2016-04-30 | Added non-absolute minimum counter-part IxMIN of the BLAS routine IxAMAX | Cedric Nugteren
2016-04-29 | Added FillCache: a function to pre-compile all kernels for a specific device | Cedric Nugteren
2016-04-28 | Fixed the cache to store binaries instead of OpenCL programs | Cedric Nugteren
2016-04-27 | Added non-absolute counter-parts xSUM and IxMAX of the BLAS routines xASUM and IxAMAX | Cedric Nugteren
2016-04-27 | Added prototypes for non-BLAS routines: xSUM and IxMAX (non-absolute counterparts of xASUM and IxAMAX) | Cedric Nugteren
2016-04-27 | Moved all cache-related functions to a separate file; added a ClearCompiledProgramCache function to clear the cache | Cedric Nugteren
2016-04-27 | All CLBlast enum constants now have the same raw values as in the cblas standard | Cedric Nugteren
2016-04-20 | Added prototype for ixAMAX routines | cnugteren
2016-04-13 | Added prototype for xASUM routines | cnugteren
2016-04-01 | Added a wrapper for CBLAS libraries for performance/correctness testing | cnugteren
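
The 2016-10-22 "treewide: use C++ exceptions properly" commit describes a layered error-handling scheme: internals throw, the C++ entry points turn library exceptions into error codes, and the C interface catches everything. The sketch below illustrates that pattern only. Every name in it (`Status`, `LibraryError`, `InternalScale`, `Scale`, `ScaleC`) is hypothetical and is not taken from clblast.h, and the status values are made up for the example.

```cpp
#include <exception>
#include <new>

// Hypothetical status codes; these are NOT the real CLBlast enum values.
enum class Status { kSuccess = 0, kInvalidArgument = -1, kUnexpectedError = -2 };

// Hypothetical library-specific exception that carries a status code.
class LibraryError : public std::exception {
 public:
  explicit LibraryError(Status status) : status_(status) {}
  Status status() const { return status_; }
  const char* what() const noexcept override { return "library error"; }
 private:
  Status status_;
};

// Internal routine: reports failure only by throwing, never by return value.
void InternalScale(int n) {
  if (n < 0) { throw LibraryError(Status::kInvalidArgument); }
  if (n == 0) { throw std::bad_alloc(); }  // contrived stand-in for a C++ runtime error
}

// C++ entry point: converts library exceptions into status codes and
// deliberately lets runtime and logic errors (e.g. std::bad_alloc) propagate.
Status Scale(int n) {
  try {
    InternalScale(n);
    return Status::kSuccess;
  } catch (const LibraryError& e) {
    return e.status();
  }
}

// C entry point: no exception may cross the C boundary, so anything that
// escaped the C++ layer is mapped to a wild-card error code.
extern "C" int ScaleC(int n) {
  try {
    return static_cast<int>(Scale(n));
  } catch (...) {
    return static_cast<int>(Status::kUnexpectedError);
  }
}
```

With these definitions, `Scale(-1)` returns `Status::kInvalidArgument`, `Scale(0)` lets `std::bad_alloc` reach the caller, and `ScaleC(0)` returns the wild-card code instead of throwing.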