path: root/src/clblast.cpp
Age         Commit message  Author
2018-01-07  Added API and tests for new GemmStridedBatched routine  Cedric Nugteren
2018-01-06  Added CUDA interface to get temporary-buffer size for GEMM routine  Cedric Nugteren
2018-01-04  Updated the generator script to automatically generate the temp-buffer code  Cedric Nugteren
2017-12-30  Added optional temp-buffer argument to C++ interface of GEMM  Cedric Nugteren
2017-12-28  Added interface to compute the required temporary buffer size for GEMM  Cedric Nugteren
2017-10-08  Moved non-routine-specific API functions and includes to separate files  Cedric Nugteren
2017-10-07  Synchronizes clpp11.h with CLCudaAPI 9.0  Cedric Nugteren
2017-10-01  Allow OverrideParameters function to work before a kernel was first used  Cedric Nugteren
2017-09-24  Updated database override function to work with the new database storage format  Cedric Nugteren
2017-09-23  Made database-caching no longer dependent on device name but on device/platform IDs  Cedric Nugteren
2017-09-16  Improved compilation time of the tuner database  Cedric Nugteren
2017-09-14  Added architecture layer in the tuning database for better performance on unseen devices  Cedric Nugteren
2017-09-06  Split the database files over multiple directories and files; first step towards separate compilation  Cedric Nugteren
2017-07-02  Added interface and stubs for the im2col routine  Cedric Nugteren
2017-06-21  Fixes some compilation issues related to the database structure change  Cedric Nugteren
2017-05-26  Fixes inability to run GEMM on multiple identical GPUs (issue #155)  Kirill Mavreshko
2017-05-12  Added the IxAMIN routines: absolute minimum version of IxAMAX  Cedric Nugteren
2017-04-10  Removed const-vector-of-const-objects from the database class to remain according to the C++11 standard  Cedric Nugteren
2017-03-10  Added API and test infrastructure for the batched GEMM routine  Cedric Nugteren
2017-03-08  Make batched routines based on offsets instead of a vector of cl_mem objects - undoing many earlier changes  Cedric Nugteren
2017-03-05  Added first naive version of the batched AXPY routine  Cedric Nugteren
2017-03-05  Prepared generator for batched routines; added batched AXPY routine interface  Cedric Nugteren
2017-02-26  Merge branch 'development' into triangular_solvers  Cedric Nugteren
2017-02-26  Removed half-precision support from the TRSM routine; too unstable  Cedric Nugteren
2017-02-16  Added a C interface to the OverrideParameters function; added some in-line comments to the API  Cedric Nugteren
2017-02-16  Added input-sanity checks for the OverrideParameters function  Cedric Nugteren
2017-02-13  Added first version of the OverrideParameters function  Cedric Nugteren
2017-02-05  Merge branch 'development' into triangular_solvers  Cedric Nugteren
2017-01-24  Routine, Cache: generalize, reduce amount of copying in fast path  Ivan Shapovalov
            Implement a generalized Cache<K, V>. Two variants are provided: the first one is based on std::map, using C++14-specific transparent std::less<> and generalized std::map::find() to allow searching by tuple of references. The second one is based on std::vector and O(n) lookup, but remains C++11-compliant.
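The two cache variants described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the class names MapCache and VectorCache, the Get/Store methods, and the (routine name, precision) key are all assumptions made for the example. What it shows is the idea of looking up an entry with a tuple of references, so that no owning key has to be built on the fast path.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical sketch: all names below are illustrative, not CLBlast's actual code.

// Variant 1 (C++14): std::map with the transparent comparator std::less<>, so that
// find() accepts any type comparable with Key, such as a tuple of references.
template <typename Key, typename Value>
class MapCache {
 public:
  template <typename U>
  bool Get(const U &key, Value *out) const {
    const auto it = map_.find(key);  // heterogeneous lookup: no Key object is built
    if (it == map_.end()) { return false; }
    *out = it->second;
    return true;
  }
  void Store(Key key, Value value) { map_.emplace(std::move(key), std::move(value)); }
 private:
  std::map<Key, Value, std::less<>> map_;
};

// Variant 2 (C++11): std::vector with a linear O(n) search using operator==.
template <typename Key, typename Value>
class VectorCache {
 public:
  template <typename U>
  bool Get(const U &key, Value *out) const {
    for (const auto &entry : entries_) {
      if (entry.first == key) { *out = entry.second; return true; }
    }
    return false;
  }
  void Store(Key key, Value value) {
    entries_.emplace_back(std::move(key), std::move(value));
  }
 private:
  std::vector<std::pair<Key, Value>> entries_;
};

// Assumed key layout for the example: (routine name, precision in bits).
using ProgramKey = std::tuple<std::string, int>;

// Stores one entry, then looks it up through std::tie, which yields a tuple of
// const references to the caller's variables instead of copying the string.
template <typename CacheT>
int LookupExample() {
  CacheT cache;
  cache.Store(ProgramKey("gemm", 32), 7);
  const std::string routine = "gemm";
  const int precision = 32;
  int value = -1;
  const bool found = cache.Get(std::tie(routine, precision), &value);
  assert(found && "the stored key should be found");
  (void)found;
  return value;
}
```

Both variants rely on std::tuple's comparison operators accepting tuples of different element types, which is what lets a tuple of references be compared against the stored owning tuple.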
2017-01-24  FillCache: perform compilation for each precision separately  Ivan Shapovalov
            Thus do not prevent filling cache for float if the device does not support e.g. double.
2017-01-20  treewide: include clpp11.hpp first to silence deprecation warnings  Ivan Shapovalov
            Otherwise, cl.h gets included through clblast.h before clpp11.hpp.
2017-01-20  Added prototype for the TRSV routine  Cedric Nugteren
2016-12-18  Prepared for the addition of the TRSM triangular solver kernel  Cedric Nugteren
2016-10-22  Routine: get rid of ::SetUp()  Ivan Shapovalov
            Since we now use C++ exceptions inside the implementation (and exceptions can be thrown from constructors), there is no need for a separate Routine::SetUp() function. For this, we also change how the kernel source string is constructed. The kernel-specific source code is now passed to the Routine ctor via an initializer_list of C strings to avoid unnecessary data copying while also working around C1091 of MSVC 2013.
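A rough sketch of the pattern this commit message describes: a constructor that does all set-up itself, reports failure by throwing, and receives the kernel source as an initializer_list of C strings. The class name RoutineSketch, its members, and the kernel text are hypothetical and not taken from the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <initializer_list>
#include <stdexcept>
#include <string>

// Hypothetical sketch, not CLBlast's actual Routine class.
class RoutineSketch {
 public:
  // All set-up happens here. A failure is reported by throwing, so a
  // half-initialized object can never be observed and no SetUp() step is needed.
  RoutineSketch(const std::string &name, std::initializer_list<const char *> sources) {
    if (sources.size() == 0) {
      throw std::invalid_argument(name + ": no kernel source given");
    }
    // Measure first so the final string is allocated once and each part is
    // copied exactly once.
    std::size_t length = 0;
    for (const char *part : sources) { length += std::strlen(part); }
    source_.reserve(length);
    for (const char *part : sources) { source_ += part; }
  }
  const std::string &source() const { return source_; }
 private:
  std::string source_;
};

// Builds a routine from three separate string literals.
inline std::string BuildExample() {
  const RoutineSketch routine("Xaxpy", {"__kernel void Xaxpy() {", " /* body */ ", "}"});
  assert(!routine.source().empty());
  return routine.source();
}

// Shows that a construction failure surfaces as an exception.
inline bool EmptySourceThrows() {
  try {
    const RoutineSketch routine("Xaxpy", {});
    (void)routine;
  } catch (const std::invalid_argument &) {
    return true;
  }
  return false;
}
```

Passing the source as several literals keeps each individual literal short. That is presumably how the commit avoids MSVC's C1091, which to my knowledge is the compiler's limit on string length.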
2016-10-22  treewide: use C++ exceptions properly  Ivan Shapovalov
            Since the codebase is designed around proper C++ idioms such as RAII, it makes sense to only use C++ exceptions internally instead of mixing exceptions and error codes. The exceptions are now caught at top level to preserve compatibility with the existing error code-based API. Note that we deliberately do not catch C++ runtime errors (such as `std::bad_alloc`) nor logic errors (aka failed assertions) because no actual handling can ever happen for such errors. However, in the C interface we do catch _all_ exceptions (...) and convert them into a wild-card error code.
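The two-layer translation described in this commit message might look roughly like the sketch below. The status values, the LibraryError type, and the CallCpp/CallC helpers are assumptions made for illustration and are not CLBlast's actual names or codes. The C++ boundary converts only the library's own exceptions and lets everything else propagate. The C boundary also catches everything else and maps it to a single wild-card code.

```cpp
#include <cassert>
#include <exception>
#include <new>

// Hypothetical status codes; the values are made up for this sketch.
enum class Status { kSuccess = 0, kInvalidValue = -1, kUnexpected = -2 };

// Hypothetical library exception that carries the status code to report.
class LibraryError : public std::exception {
 public:
  explicit LibraryError(Status status) : status_(status) {}
  Status status() const { return status_; }
  const char *what() const noexcept override { return "library error"; }
 private:
  Status status_;
};

// C++ API boundary: only the library's own exceptions become status codes.
// Anything else (std::bad_alloc, logic errors) deliberately propagates.
template <typename F>
Status CallCpp(F &&body) {
  try {
    body();
    return Status::kSuccess;
  } catch (const LibraryError &e) {
    return e.status();
  }
}

// C API boundary: no exception may cross into C code, so whatever escapes the
// C++ layer is converted into one wild-card error code.
template <typename F>
int CallC(F &&body) {
  try {
    return static_cast<int>(CallCpp(body));
  } catch (...) {
    return static_cast<int>(Status::kUnexpected);
  }
}

// Checks that std::bad_alloc is not swallowed by the C++ layer.
inline bool BadAllocEscapesCppLayer() {
  try {
    CallCpp([] { throw std::bad_alloc(); });
  } catch (const std::bad_alloc &) {
    return true;
  }
  return false;
}

// Small usage example covering the three outcomes at the C boundary.
inline void RunBoundaryChecks() {
  assert(CallC([] {}) == static_cast<int>(Status::kSuccess));
  assert(CallC([] { throw LibraryError(Status::kInvalidValue); }) ==
         static_cast<int>(Status::kInvalidValue));
  assert(CallC([] { throw std::bad_alloc(); }) == static_cast<int>(Status::kUnexpected));
}
```

Keeping the translation in one helper per boundary means the routine implementations can throw freely, while callers of the error-code API see no change in behaviour.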
2016-06-30  Added declspec(dllexport) to ClearCache and FillCache, and added declspec(dllimport) when not building the library  Cedric Nugteren
2016-06-19  Renamed all C++ source files to .cpp to match the .hpp extension better  Cedric Nugteren