Age | Commit message | Author | |
---|---|---|---|
2023-05-07 | AMAX/AMIN integer testing and bug fixes (#457) | Cedric Nugteren | |
* Fixed a bug in XAMAX/XMIN routines that caused the increment and offset to be included in the result * Perform proper integer-output testing in XAMAX tests * A few changes towards getting it ready for a PR * Also fix compilation for clBLAS and cuBLAS references * Fix a bug that would only use the real part of complex numbers in the amax/amin routines * A few small fixes related to the AMAX tests | |||
2018-08-05 | Added an option to compile the Netlib API with static OpenCL device and context | Cedric Nugteren | |
2018-01-07 | Added API and tests for new GemmStridedBatched routine | Cedric Nugteren | |
2018-01-06 | Fixed a minor nullptr related issue in the code generator | Cedric Nugteren | |
2018-01-04 | Added a CUDA version of the GEMM temp-buffer optional argument | Cedric Nugteren | |
2018-01-04 | Updated the generator script to automatically generate the temp-buffer code | Cedric Nugteren | |
2017-10-14 | Various fixes to make the host code and sample compile with the CUDA API | Cedric Nugteren | |
2017-10-12 | CUDA API now takes context and device in instead of stream | Cedric Nugteren | |
2017-10-11 | Added first (untested) version of a CUDA API | Cedric Nugteren | |
2017-10-09 | Fixed the Python generator script w.r.t. the recent change of testing direct/in-direct GEMM kernels separately | Cedric Nugteren | |
2017-06-25 | Fixed some Clang and MSVC warnings | Cedric Nugteren | |
2017-04-13 | Fixed CUDA malloc and cuBLAS handles: cuBLAS as a performance-reference now works | Cedric Nugteren | |
2017-04-11 | Made compilation of the cuBLAS wrapper work properly | Cedric Nugteren | |
2017-04-06 | Completed the cuBLAS wrapper | Cedric Nugteren | |
2017-04-05 | Added a first version of a cuBLAS wrapper (WIP) | Cedric Nugteren | |
2017-04-03 | In-lined the float2 and double2 types to avoid collision with CUDA's definitions | Cedric Nugteren | |
2017-03-05 | Prepared generator for batched routines; added batched AXPY routine interface | Cedric Nugteren | |
2016-11-27 | Made it possible to use the command-line environmental variables for each executable and without re-running CMake | Cedric Nugteren | |
2016-11-22 | Minor changes to ensure full compatibility with the Netlib CBLAS API | Cedric Nugteren | |
2016-11-20 | Made functions with scalar-buffers as output properly return values | Cedric Nugteren | |
2016-10-25 | Renamed the include and source files of the Netlib CBLAS API | Cedric Nugteren | |
2016-10-25 | Removed the clblast namespace from the Netlib C API source file to ensure proper linking | Cedric Nugteren | |
2016-10-25 | Made the Netlib CBLAS API use the same enums with prefixes as the regular C API of CLBlast | Cedric Nugteren | |
2016-10-25 | Added initial version of a Netlib CBLAS implementation. TODO: Set correct buffer sizes | Cedric Nugteren | |
2016-10-25 | Merge branch 'development' into netlib_blas_api | Cedric Nugteren | |
Conflicts: scripts/generator/generator.py scripts/generator/generator/routine.py | |||
2016-10-22 | All enums in the C API are now prefixed with CLBlast to avoid potential name clashes with other projects | Cedric Nugteren | |
2016-10-22 | Routine: get rid of ::SetUp() | Ivan Shapovalov | |
Since we now use C++ exceptions inside the implementation (and exceptions can be thrown from constructors), there is no need for a separate Routine::SetUp() function. For this, we also change the way how the kernel source string is constructed. The kernel-specific source code is now passed to the Routine ctor via an initializer_list of C strings to avoid unnecessary data copying while also working around C1091 of MSVC 2013. | |||
2016-10-22 | treewide: use C++ exceptions properly | Ivan Shapovalov | |
Since the codebase is designed around proper C++ idioms such as RAII, it makes sense to only use C++ exceptions internally instead of mixing exceptions and error codes. The exceptions are now caught at top level to preserve compatibility with the existing error code-based API. Note that we deliberately do not catch C++ runtime errors (such as `std::bad_alloc`) nor logic errors (aka failed assertions) because no actual handling can ever happen for such errors. However, in the C interface we do catch _all_ exceptions (...) and convert them into a wild-card error code. | |||
2016-10-05 | Added first version of Netlib BLAS API header | Cedric Nugteren | |
2016-09-04 | Refactored the Python C++ generator script; now conforms to the PEP8 styleguide | Cedric Nugteren | |