path: root/src/cupp11.hpp
Age  Commit message  Author
2022-05-23  Fix API inconsistency in cupp11.hpp  Cedric Nugteren
    The function `CopyToAsync` has an optional event argument in the OpenCL version, which is used in CLBlast. This makes the code not compile at all if CUDA (through `cupp11.hpp`) is used as backend. This issue was found by a CLBlast user and reported privately by email. This PR should fix that.
2018-07-14  Applied feedback from Cedric from first pull request  Tyler Sorensen
2018-06-03  Fixes for CUDA version of CLBlast  Cedric Nugteren
2017-12-30  Added optional temp-buffer argument to C++ interface of GEMM  Cedric Nugteren
2017-12-09  Made the pre-processor run by default for ARM and Qualcomm GPUs  Cedric Nugteren
2017-11-20  Potentially fixed an MSVC 2013 issue with a copy-constructor not being generated  Cedric Nugteren
2017-11-19  Some fixes for the new auto-tuner to be compatible with the Python scripts  Cedric Nugteren
2017-10-18  Moved CUmodule code from Kernel to Program class to not require re-compilation every time  Cedric Nugteren
2017-10-17  Fix an incompatibility with CUDA's FP16 definition  Cedric Nugteren
2017-10-16  Made all CUDA kernel launches synchronous; removed exception raising  Cedric Nugteren
2017-10-15  Added the SM-compute-arch version to the nv compile options  Cedric Nugteren
2017-10-15  Various fixes to make the first CUDA examples work  Cedric Nugteren
2017-10-14  Various fixes to make the host code and sample compile with the CUDA API  Cedric Nugteren
2017-10-11  Added first (untested) version of a CUDA API  Cedric Nugteren
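
The 2022-05-23 entry describes a signature mismatch: callers pass an optional event argument to `CopyToAsync`, which the CUDA wrapper did not accept. The following is a minimal sketch of how a defaulted trailing parameter lets both call forms compile. It is not the actual code from `cupp11.hpp`: the `Event`, `EventPointer`, `Queue`, and `Buffer` types here are simplified, hypothetical stand-ins, and the copy is done synchronously on the host purely for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration only; not the real CLBlast types.
struct Event { bool recorded = false; };
using EventPointer = Event*;
struct Queue {};

template <typename T>
class Buffer {
 public:
  explicit Buffer(const std::size_t size) : data_(size) {}

  // The trailing event argument defaults to nullptr, so a caller may pass an
  // event or leave it out. Assumes 'size' fits in both buffers.
  void CopyToAsync(const Queue &, const std::size_t size, Buffer<T> &destination,
                   EventPointer event = nullptr) const {
    for (std::size_t i = 0; i < size; ++i) { destination.data_[i] = data_[i]; }
    if (event != nullptr) { event->recorded = true; }  // only touched when given
  }

  std::vector<T> &data() { return data_; }

 private:
  std::vector<T> data_;
};

// Exercises both call forms: with an explicit event and without one.
inline bool BothCallFormsWork() {
  Queue queue;
  Buffer<float> source(4), with_event(4), without_event(4);
  source.data() = {1.0f, 2.0f, 3.0f, 4.0f};
  Event event;
  source.CopyToAsync(queue, 4, with_event, &event);  // caller supplies an event
  source.CopyToAsync(queue, 4, without_event);       // caller omits the event
  return event.recorded &&
         with_event.data() == source.data() &&
         without_event.data() == source.data();
}
```

Because the parameter is defaulted rather than added as a second overload, existing call sites that omit the event keep compiling unchanged.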