Age | Commit message | Author | |
---|---|---|---|
2017-01-24 | FillCache: perform compilation for each precision separately | Ivan Shapovalov | |
This way, filling the cache for float is not prevented if the device does not support, e.g., double. | |||
2017-01-24 | Routine: fix semi-warm routine construction (when binary is in cache) | Ivan Shapovalov | |
There was a missing return statement in the semi-warm path that made CLBlast continue to the cold path after a cache hit. | |||
2017-01-24 | src/clpp11.hpp: check pointers before clRelease*() | Ivan Shapovalov | |
This is to avoid spurious "induced" errors on destruction, if construction failed for some reason. | |||
2017-01-24 | src/clpp11.hpp: do not store program source/binary in Program | Ivan Shapovalov | |
The stored source/binary does not seem to serve any purpose, yet its presence makes Program a heavy (not purely refcounted) object, which is undesirable, especially because it is copied from the cache in the hot path. | |||
2017-01-20 | treewide: include clpp11.hpp first to silence deprecation warnings | Ivan Shapovalov | |
Otherwise, cl.h gets included through clblast.h before clpp11.hpp. | |||
2017-01-20 | Routine: use PrecisionSupported<>() instead of duplicating the check | Ivan Shapovalov | |
2017-01-19 | Added tuning results for NVIDIA GTX 1080 and Intel Core i7-4790K | Cedric Nugteren | |
2017-01-07 | Always enables cl_khr_fp64 when running double-precision, not just for OpenCL 1.1 or lower | Cedric Nugteren | |
2017-01-03 | Added tuning results for the AMD Turks GPU and the Intel Core i7-2670QM CPU | Cedric Nugteren | |
2016-12-18 | Fixed a bug when using offsets in the direct GEMM kernels | Cedric Nugteren | |
2016-11-29 | Made Intel GPUs always use the indirect version of the GEMM kernel | Cedric Nugteren | |
2016-11-27 | Made it possible to use the command-line environmental variables for each executable and without re-running CMake | Cedric Nugteren | |
2016-11-26 | Improved the default parameters for cases with non-common parameters across all devices | Cedric Nugteren | |
2016-11-24 | Merge pull request #125 from CNugteren/netlib_blas_api | Cedric Nugteren | |
Netlib CBLAS API for CLBlast | |||
2016-11-23 | Fixed a vector-size related bug in the CLBlast Netlib API | Cedric Nugteren | |
2016-11-23 | Fixed a bug in the HSCAL routine | Cedric Nugteren | |
2016-11-22 | Minor changes to ensure full compatibility with the Netlib CBLAS API | Cedric Nugteren | |
2016-11-20 | Made functions with scalar-buffers as output properly return values | Cedric Nugteren | |
2016-11-20 | Now correctly tests for validity of the B matrix in the TRMM routine | Cedric Nugteren | |
2016-11-20 | Forced OpenCL 1.1 compilation and disabled a deprecation warning | Cedric Nugteren | |
2016-11-20 | Fixed a bug in the TRMM routine caused by overwriting input data before consuming everything | Cedric Nugteren | |
2016-11-19 | Changed the GEMM kernel selection parameters for Skylake GPUs to always favour the regular kernel | Cedric Nugteren | |
2016-11-15 | Updated the tuning results for the Intel Skylake ULT GT2 GPU | Cedric Nugteren | |
2016-10-25 | Renamed the include and source files of the Netlib CBLAS API | Cedric Nugteren | |
2016-10-25 | Removed the clblast namespace from the Netlib C API source file to ensure proper linking | Cedric Nugteren | |
2016-10-25 | Fixed some issues preventing the Netlib CBLAS API from linking correctly | Cedric Nugteren | |
2016-10-25 | Made the Netlib CBLAS API use the same enums with prefixes as the regular C API of CLBlast | Cedric Nugteren | |
2016-10-25 | Sets the proper sizes for the buffers for the Netlib CBLAS API | Cedric Nugteren | |
2016-10-25 | Added initial version of a Netlib CBLAS implementation. TODO: Set correct buffer sizes | Cedric Nugteren | |
2016-10-24 | Added tuning results for GeForce GTX TITAN Black | Cedric Nugteren | |
2016-10-23 | Fixed a bug in the transpose-matrix function | Cedric Nugteren | |
2016-10-23 | Removed PUBLIC_API from the C++ exception classes | Cedric Nugteren | |
2016-10-23 | Added a fix for compilation under Visual Studio 2013 related to the new exception classes | Cedric Nugteren | |
2016-10-22 | Added tuning results for the AMD Tonga GPU | Cedric Nugteren | |
2016-10-22 | All enums in the C API are now prefixed with CLBlast to avoid potential name clashes with other projects | Cedric Nugteren | |
2016-10-22 | Moved files around a bit; created a utilities subfolder | Cedric Nugteren | |
2016-10-22 | Added documentation for the better exception handling | Cedric Nugteren | |
2016-10-22 | Merge pull request #117 from intelfx/exceptions | Cedric Nugteren | |
Convert to use C++ exceptions internally | |||
2016-10-22 | Fixed a bug in the SYRK/SYR2K/HERK/HER2K routines that would occur with specific tuning parameters (2) | Cedric Nugteren | |
2016-10-22 | Fixed a bug in the SYRK/SYR2K/HERK/HER2K routines that would occur with specific tuning parameters | Cedric Nugteren | |
2016-10-22 | Routine: get rid of ::SetUp() | Ivan Shapovalov | |
Since we now use C++ exceptions inside the implementation (and exceptions can be thrown from constructors), there is no need for a separate Routine::SetUp() function. To make this possible, we also change how the kernel source string is constructed. The kernel-specific source code is now passed to the Routine ctor via an initializer_list of C strings to avoid unnecessary data copying while also working around C1091 of MSVC 2013. | |||
2016-10-22 | treewide: use C++ exceptions properly | Ivan Shapovalov | |
Since the codebase is designed around proper C++ idioms such as RAII, it makes sense to only use C++ exceptions internally instead of mixing exceptions and error codes. The exceptions are now caught at top level to preserve compatibility with the existing error code-based API. Note that we deliberately do not catch C++ runtime errors (such as `std::bad_alloc`) nor logic errors (aka failed assertions) because no actual handling can ever happen for such errors. However, in the C interface we do catch _all_ exceptions (...) and convert them into a wild-card error code. | |||
2016-10-22 | src/clpp11.hpp: avoid throwing exceptions from std::shared_ptr's Deleter | Ivan Shapovalov | |
2016-10-22 | src/clpp11.hpp: GetInfoString: avoid reallocation | Ivan Shapovalov | |
2016-10-22 | src/clpp11.hpp: reinstate error checking on clGetEventProfilingInfo() | Ivan Shapovalov | |
2016-10-21 | Merge pull request #118 from matze/add-pkg-config | Cedric Nugteren | |
Generate and install pkg-config description | |||
2016-10-21 | Generate and install pkg-config description | Matthias Vogelgesang | |
2016-10-14 | Fixed an issue with a growing database: the database is now a global variable in a namespace and its container uses const-pointers to the actual data | Cedric Nugteren | |
2016-10-13 | Added tuning results for Intel HD Graphics IvyBridge GPU | Cedric Nugteren | |
2016-10-12 | Removed a spurious #ifdef | Cedric Nugteren | |