path: root/src
Age | Commit message | Author
2016-10-25 | Added initial version of a Netlib CBLAS implementation. TODO: Set correct buffer sizes | Cedric Nugteren
2016-10-24 | Added tuning results for GeForce GTX TITAN Black | Cedric Nugteren
2016-10-23 | Fixed a bug in the transpose-matrix function | Cedric Nugteren
2016-10-23 | Removed PUBLIC_API from the C++ exception classes | Cedric Nugteren
2016-10-23 | Added a fix for compilation under Visual Studio 2013 related to the new exception classes | Cedric Nugteren
2016-10-22 | Added tuning results for the AMD Tonga GPU | Cedric Nugteren
2016-10-22 | All enums in the C API are now prefixed with CLBlast to avoid potential name clashes with other projects | Cedric Nugteren
2016-10-22 | Moved files around a bit; created a utilities subfolder | Cedric Nugteren
2016-10-22 | Added documentation for the better exception handling | Cedric Nugteren
2016-10-22 | Merge pull request #117 from intelfx/exceptions | Cedric Nugteren
    Convert to use C++ exceptions internally
2016-10-22 | Fixed a bug in the SYRK/SYR2K/HERK/HER2K routines that would occur with specific tuning parameters (2) | Cedric Nugteren
2016-10-22 | Fixed a bug in the SYRK/SYR2K/HERK/HER2K routines that would occur with specific tuning parameters | Cedric Nugteren
2016-10-22 | Routine: get rid of ::SetUp() | Ivan Shapovalov
    Since we now use C++ exceptions inside the implementation (and exceptions can be thrown from constructors), there is no need for a separate Routine::SetUp() function. For this, we also change how the kernel source string is constructed. The kernel-specific source code is now passed to the Routine ctor via an initializer_list of C strings to avoid unnecessary data copying while also working around C1091 of MSVC 2013.
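The idea in this commit message can be illustrated with a minimal sketch. It is not the library's code: the class layout, the member names, and the kernel strings are hypothetical. It only shows a constructor that does all set-up itself, throws on failure instead of relying on a separate SetUp() call, and receives the kernel source as an initializer_list of C strings so that each part can remain a separate, shorter literal.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a routine class; names and members are illustrative only.
class Routine {
 public:
  // All set-up happens in the constructor. A failure is reported by throwing,
  // so the caller never sees a half-initialised object.
  Routine(const std::string& name, std::initializer_list<const char*> sources)
      : name_(name) {
    // First pass: compute the total length so the string is allocated once.
    std::size_t total = 0;
    for (const char* part : sources) {
      total += std::char_traits<char>::length(part);
    }
    source_.reserve(total);
    // Second pass: concatenate the parts in the order given.
    for (const char* part : sources) {
      source_ += part;
    }
    if (source_.empty()) {
      throw std::invalid_argument("no kernel source given for routine " + name_);
    }
  }

  const std::string& name() const { return name_; }
  const std::string& source() const { return source_; }

 private:
  std::string name_;
  std::string source_;
};

// Usage: the source is passed as several separate string literals.
inline std::string ExampleSource() {
  const Routine routine("Xaxpy", {"#define WGS 64\n", "__kernel void Xaxpy() {}\n"});
  assert(routine.name() == "Xaxpy");
  return routine.source();
}
```

Passing pointers to the literals means nothing is copied until the single concatenation inside the constructor, which is one plausible reading of "avoid unnecessary data copying" in the message.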
2016-10-22 | treewide: use C++ exceptions properly | Ivan Shapovalov
    Since the codebase is designed around proper C++ idioms such as RAII, it makes sense to only use C++ exceptions internally instead of mixing exceptions and error codes. The exceptions are now caught at top level to preserve compatibility with the existing error code-based API. Note that we deliberately do not catch C++ runtime errors (such as `std::bad_alloc`) nor logic errors (aka failed assertions) because no actual handling can ever happen for such errors. However, in the C interface we do catch _all_ exceptions (...) and convert them into a wild-card error code.
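The layering described in this commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's implementation: StatusCode, RoutineError, ScaleImpl, Scale, and ExampleScale are all hypothetical names, and the numeric codes are made up. The C++ entry point translates only the library's own exception type into a status code, while the C entry point catches everything.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>

// Hypothetical status codes; the values are illustrative, not the library's.
enum class StatusCode { kSuccess = 0, kInvalidArgument = -1, kUnexpectedError = -2 };

// Hypothetical library-specific exception that carries a status code.
class RoutineError : public std::exception {
 public:
  explicit RoutineError(StatusCode status) : status_(status) {}
  StatusCode status() const noexcept { return status_; }
  const char* what() const noexcept override { return "routine error"; }

 private:
  StatusCode status_;
};

// Internal code reports every failure by throwing.
void ScaleImpl(int n) {
  if (n < 0) { throw RoutineError(StatusCode::kInvalidArgument); }
  // Stand-in for an error that does not originate from the library itself.
  if (n > 1000) { throw std::length_error("unexpected size"); }
}

// C++ entry point: library errors become status codes; anything else
// (std::bad_alloc, logic errors) propagates to the caller.
StatusCode Scale(int n) {
  try {
    ScaleImpl(n);
    return StatusCode::kSuccess;
  } catch (const RoutineError& error) {
    return error.status();
  }
}

// C entry point: no exception may cross the C boundary, so everything that
// is left is mapped onto one wild-card code.
extern "C" int ExampleScale(int n) {
  try {
    return static_cast<int>(Scale(n));
  } catch (...) {
    return static_cast<int>(StatusCode::kUnexpectedError);
  }
}

// Usage: both entry points agree on library errors.
inline void ExampleUsage() {
  assert(Scale(8) == StatusCode::kSuccess);
  assert(ExampleScale(-1) == static_cast<int>(StatusCode::kInvalidArgument));
}
```

The catch-all lives only in the `extern "C"` wrapper because a C caller has no way to handle a C++ exception, whereas a C++ caller can still see `std::bad_alloc` and similar errors directly.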
2016-10-22 | src/clpp11.hpp: avoid throwing exceptions from std::shared_ptr's Deleter | Ivan Shapovalov
2016-10-22 | src/clpp11.hpp: GetInfoString: avoid reallocation | Ivan Shapovalov
2016-10-22 | src/clpp11.hpp: reinstate error checking on clGetEventProfilingInfo() | Ivan Shapovalov
2016-10-21 | Merge pull request #118 from matze/add-pkg-config | Cedric Nugteren
    Generate and install pkg-config description
2016-10-21 | Generate and install pkg-config description | Matthias Vogelgesang
2016-10-14 | Fixed an issue with a growing database: the database is now a global variable in a namespace and its container uses const-pointers to the actual data | Cedric Nugteren
2016-10-13 | Added tuning results for Intel HD Graphics IvyBridge GPU | Cedric Nugteren
2016-10-12 | Removed a spurious #ifdef | Cedric Nugteren
2016-10-12 | Fixed missing line ending | Cedric Nugteren
2016-10-10 | Added support for compiling the library, the client, and the samples under MSVC 2013 | Cedric Nugteren
2016-10-10 | Fixed an issue with const members of structs in the database | Cedric Nugteren
2016-10-10 | Fixed an issue with the length of the GEMM OpenCL string for both MSVC 2013 and 2015 | Cedric Nugteren
2016-10-10 | First fixes towards compilation on Visual Studio 2013 | Cedric Nugteren
2016-10-10 | Updated the tuning results for the GTX 750 Ti GPU | Cedric Nugteren
2016-10-10 | Changed the thresholds for the direct/indirect GEMM kernels for NVIDIA and Intel GPUs | Cedric Nugteren
2016-10-08 | Fixed a performance bug for Intel Iris Pro GPUs due to incorrect tuning results | Cedric Nugteren
2016-10-06 | Added first tuning results for the single-kernel direct GEMM implementation | Cedric Nugteren
2016-10-06 | Added a kernel selection database to select between the direct and indirect GEMM kernels | Cedric Nugteren
2016-10-03 | Fixed a const-correctness issue with complex conjugation in the GEMM direct kernel | Cedric Nugteren
2016-10-03 | Added functions to load from off-chip to local memory without vector loads for the GEMM direct kernels | Cedric Nugteren
2016-10-03 | Re-organised GEMM direct kernel and added faster fall-back version for incomplete rectangles | Cedric Nugteren
2016-10-02 | Set the default number of runs for all kernels to at least 2 runs | Cedric Nugteren
2016-10-02 | Specialised the GEMM direct kernel in four ways for transposing/non-transposing: NN, NT, TN, TT | Cedric Nugteren
2016-10-02 | Split the GEMM direct kernel into two files; set the default tuning target to 256-256-256 | Cedric Nugteren
2016-10-01 | Added padding to the local memory of the GEMM direct kernel | Cedric Nugteren
2016-10-01 | Added default num-runs to the tuner adding averaging over 10 runs as a default for the GEMM direct kernel | Cedric Nugteren
2016-10-01 | Merge branch 'development' into gemm_direct | Cedric Nugteren
2016-09-27 | Added an option to run tuned kernels multiple times to average execution times; requires CLTune 2.5.0 | Cedric Nugteren
2016-09-27 | Updated to version 8.0 of the CLCudaAPI header | Cedric Nugteren
2016-09-27 | Fixed the local memory size computation for the GEMM tuners | Cedric Nugteren
2016-09-27 | Now generates test/client/tuner data using a fixed seed to enable reproducibility of results | Cedric Nugteren
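As a rough illustration of what generating data from a fixed seed means, the sketch below fills a buffer from an explicitly seeded engine. The function name, the value range, and the seed value are assumptions made for this example; the commit message does not state which generator or seed the project uses.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Fills a buffer with pseudo-random values drawn from an explicitly seeded
// engine. The same seed always yields the same sequence within one build.
std::vector<float> GenerateData(std::size_t size, unsigned int seed) {
  std::mt19937 engine(seed);
  std::uniform_real_distribution<float> distribution(-2.0f, 2.0f);
  std::vector<float> data(size);
  for (auto& value : data) {
    value = distribution(engine);
  }
  return data;
}

// Usage: two runs with the same (hypothetical) seed produce identical input.
inline void ExampleUsage() {
  constexpr unsigned int kSeed = 42;  // hypothetical value, chosen for the example
  assert(GenerateData(16, kSeed) == GenerateData(16, kSeed));
}
```

The engine's output for a given seed is fixed by the C++ standard, but the distribution's mapping is implementation-defined, so results are reproducible across runs of the same build rather than guaranteed identical across standard libraries.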
2016-09-25 | Added a first version of a tuner for the GEMM direct kernel; collapsed MWGD, NWGD and KWGD into one WGD parameter | Cedric Nugteren
2016-09-25 | Separated the tuning parameters of the new direct GEMM kernel from the indirect version | Cedric Nugteren
2016-09-25 | Added a first version of the direct version of GEMM with local memory | Cedric Nugteren
2016-09-21 | Merge branch 'development' into gemm_direct | Cedric Nugteren
2016-09-21 | It is now possible to set the OpenCL compiler options through an environmental variable | Cedric Nugteren
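A minimal sketch of reading compiler options from the environment is shown below. The variable name EXAMPLE_BUILD_OPTIONS and both function names are hypothetical, and appending the user's options to the defaults (rather than replacing them) is an assumption; the commit message does not say which behaviour the project chose.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Joins the default options with optional extra ones. `extra` may be null,
// which is what std::getenv returns for an unset variable.
std::string CombineOptions(const std::string& defaults, const char* extra) {
  if (extra == nullptr || *extra == '\0') { return defaults; }
  if (defaults.empty()) { return extra; }
  return defaults + " " + extra;
}

// Reads the extra options from a (hypothetically named) environment variable.
std::string CompilerOptions(const std::string& defaults) {
  return CombineOptions(defaults, std::getenv("EXAMPLE_BUILD_OPTIONS"));
}

// Usage: with no extra options the defaults are passed through unchanged.
inline void ExampleUsage() {
  assert(CombineOptions("-cl-std=CL1.1", nullptr) == "-cl-std=CL1.1");
}
```

Keeping the string handling separate from the `std::getenv` call makes the combination logic easy to check without touching the process environment.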