path: root/src/clpp11.hpp
Age  Commit message  Author
2017-10-07  Synchronizes clpp11.h with CLCudaAPI 9.0  Cedric Nugteren
2017-09-23  Made database-caching no longer dependent on device name but on device/platform IDs  Cedric Nugteren
2017-09-16  Fixed an issue with the NVIDIA compute capability not being retrieved properly  Cedric Nugteren
2017-09-14  Added a guard against missing AMD and NVIDIA extensions  Cedric Nugteren
2017-09-10  Added the new vendor-architecture-name hierarchy to the tuners as well  Cedric Nugteren
2017-09-08  Introduced the notion of a device-architecture for the database and added device and architecture name mappings  Cedric Nugteren
2017-04-07  Added a special override database for the Apple CPU implementation on OS X: this makes the test work, it does not focus on good performance  Cedric Nugteren
2017-03-08  Make batched routines based on offsets instead of a vector of cl_mem objects - undoing many earlier changes  Cedric Nugteren
2017-01-24  Routine, Cache: generalize, reduce amount of copying in fast path  Ivan Shapovalov
Implement a generalized Cache<K, V>. Two variants are provided: the first one is based on std::map, using C++14-specific transparent std::less<> and generalized std::map::find() to allow searching by tuple of references. The second one is based on std::vector and O(n) lookup, but remains C++11-compliant.
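
The C++11-compliant variant described above can be sketched roughly as follows. This is an illustration only, not the code from the commit: the names `VectorCache`, `ProgramKey` and `ProgramKeyRef` are hypothetical. It shows how a lookup by a tuple of references avoids building (and copying into) an owning key. The `std::map` variant with `std::less<>` would need C++14 and is not shown.

```cpp
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical sketch of a vector-backed cache with O(n) lookup (C++11).
template <typename Key, typename Value>
class VectorCache {
 public:
  // Accepts any type U that can be compared to Key with ==, for example a
  // tuple of references, so the caller does not have to construct a Key.
  template <typename U>
  bool Get(const U &key, Value &out) const {
    for (const auto &entry : entries_) {
      if (entry.first == key) {
        out = entry.second;
        return true;
      }
    }
    return false;
  }

  // Takes ownership of the key and value; no duplicate check in this sketch.
  void Store(Key key, Value value) {
    entries_.emplace_back(std::move(key), std::move(value));
  }

 private:
  std::vector<std::pair<Key, Value>> entries_;
};

// Owning key (what is stored) and borrowed key (what is used for lookup).
using ProgramKey = std::tuple<std::string, int>;
using ProgramKeyRef = std::tuple<const std::string &, const int &>;
```

A lookup such as `cache.Get(ProgramKeyRef{name, precision}, value)` compares element by element through `std::tuple`'s mixed-type `operator==`, so no `std::string` is copied on the fast path.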
2017-01-24  src/clpp11.hpp: check pointers before clRelease*()  Ivan Shapovalov
This is to avoid spurious "induced" errors on destruction, if construction failed for some reason.
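
A minimal sketch of the idea, assuming the handle lives in a `std::shared_ptr`-owned slot whose deleter performs the release. `FakeHandle`, `FakeRelease` and `Resource` are hypothetical stand-ins so the example compiles without an OpenCL SDK; they are not names from clpp11.hpp.

```cpp
#include <memory>

// Stand-ins for an OpenCL handle and its clRelease*() function (hypothetical).
struct FakeHandle { int id; };
using Handle = FakeHandle *;

static int g_release_calls = 0;

inline int FakeRelease(Handle handle) {
  ++g_release_calls;
  return handle != nullptr ? 0 : -1;  // releasing a null handle is an error
}

class Resource {
 public:
  // The shared_ptr owns a slot holding the handle. The deleter runs even if
  // the slot still holds nullptr because construction failed part-way.
  explicit Resource(Handle raw)
      : slot_(new Handle(raw), [](Handle *slot) {
          if (*slot != nullptr) {
            FakeRelease(*slot);  // only release handles that were created
          }
          delete slot;
        }) {}

  Handle get() const { return *slot_; }

 private:
  std::shared_ptr<Handle> slot_;
};
```

Without the null check, destroying a half-constructed object would call the release function on a null handle and report a second, misleading error on top of the original failure.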
2017-01-24  src/clpp11.hpp: do not store program source/binary in Program  Ivan Shapovalov
The stored source/binary does not seem to serve any purpose, yet its presence makes Program a heavy (not pure refcounted) object, which is undesired esp. because it is copied from the cache in the hot path.
2016-11-20  Forced OpenCL 1.1 compilation and disabled a deprecation warning  Cedric Nugteren
2016-10-22  treewide: use C++ exceptions properly  Ivan Shapovalov
Since the codebase is designed around proper C++ idioms such as RAII, it makes sense to only use C++ exceptions internally instead of mixing exceptions and error codes. The exceptions are now caught at top level to preserve compatibility with the existing error code-based API. Note that we deliberately do not catch C++ runtime errors (such as `std::bad_alloc`) nor logic errors (aka failed assertions) because no actual handling can ever happen for such errors. However, in the C interface we do catch _all_ exceptions (...) and convert them into a wild-card error code.
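
The layering described above might look roughly like the sketch below. All names (`StatusError`, `DoWork`, `CppEntryPoint`, `CEntryPoint`) and the numeric codes are hypothetical and chosen for illustration; the real library defines its own exception types and status codes.

```cpp
#include <exception>
#include <stdexcept>

// Hypothetical status codes.
enum StatusCode { kSuccess = 0, kInvalidValue = 1, kUnexpectedError = 2 };

// Hypothetical library exception carrying a status code.
class StatusError : public std::exception {
 public:
  explicit StatusError(int code) : code_(code) {}
  int code() const { return code_; }
  const char *what() const noexcept override { return "status error"; }

 private:
  int code_;
};

// Internal routine: reports failure by throwing, never by returning a code.
inline void DoWork(int n) {
  if (n < 0) { throw StatusError(kInvalidValue); }
  if (n == 0) { throw std::logic_error("n must not be zero"); }
}

// C++ API boundary: only the library's own exception is translated into an
// error code. Logic errors and std::bad_alloc propagate to the caller.
inline int CppEntryPoint(int n) {
  try {
    DoWork(n);
    return kSuccess;
  } catch (const StatusError &e) {
    return e.code();
  }
}

// C API boundary (would be declared extern "C" in a real C interface):
// exceptions must not cross into C, so everything is caught.
inline int CEntryPoint(int n) {
  try {
    return CppEntryPoint(n);
  } catch (...) {
    return kUnexpectedError;  // wild-card error code
  }
}
```

With this split, `CppEntryPoint(0)` lets the `std::logic_error` escape to the C++ caller, while `CEntryPoint(0)` turns the same failure into the wild-card code.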
2016-10-22  src/clpp11.hpp: avoid throwing exceptions from std::shared_ptr's Deleter  Ivan Shapovalov
2016-10-22  src/clpp11.hpp: GetInfoString: avoid reallocation  Ivan Shapovalov
2016-10-22  src/clpp11.hpp: reinstate error checking on clGetEventProfilingInfo()  Ivan Shapovalov
2016-09-27  Updated to version 8.0 of the CLCudaAPI header  Cedric Nugteren
2016-07-22  clblast::RunKernel, cl::Kernel: unify variants with/without waitForEvents, support empty LWS  Ivan Shapovalov
2016-07-22  cl::Kernel: skip NULL entries in waitForEvents  Ivan Shapovalov
2016-07-22  clblast::RunKernel, cl::Kernel: take const vector as waitForEvents  Ivan Shapovalov
2016-07-16  Fixed some more types and type conversions in the clpp11 interface to OpenCL  Cedric Nugteren
2016-07-13  Make sure the passed types are large enough.  Gian-Carlo Pascutto
Make sure all out parameters that are passed to functions such as clGetDeviceInfo are large enough to contain the replies.
2016-07-06  Added a VERBOSE mode to debug performance: now prints details about compilation and kernel execution to screen  Cedric Nugteren
2016-07-02  Ensure clGetKernelWorkGroupInfo return value fits.  Gian-Carlo Pascutto
In LocalMemUsage(), there's a first call to clGetKernelWorkGroupInfo to get the "bytes" amount needed to store the result from CL_KERNEL_LOCAL_MEM_SIZE. However, the actual value passed is an "auto result = size_t", which in 32-bit mode is 4 bytes, regardless of the previous return value. The spec describes that it will actually be a cl_ulong which is 8 bytes. To prevent stack corruption, make sure we are in fact passing a cl_ulong. Also adjust all callers to take the changed type into account.
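
The size mismatch can be illustrated with a self-contained sketch. `FakeQuery` is a hypothetical stand-in for a `clGetKernelWorkGroupInfo`-style call, and `std::uint64_t` stands in for `cl_ulong`; the value 4096 is arbitrary. The point is that the destination must be 8 bytes wide regardless of what `sizeof(size_t)` is on the platform.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Stand-in for a clGetKernelWorkGroupInfo-style query (hypothetical). It
// always produces a 64-bit value, as the spec describes for
// CL_KERNEL_LOCAL_MEM_SIZE.
inline int FakeQuery(std::size_t buffer_size, void *buffer,
                     std::size_t *size_needed) {
  const std::uint64_t local_mem_size = 4096;
  if (size_needed != nullptr) { *size_needed = sizeof(local_mem_size); }
  if (buffer != nullptr) {
    if (buffer_size < sizeof(local_mem_size)) { return -1; }  // too small
    std::memcpy(buffer, &local_mem_size, sizeof(local_mem_size));
  }
  return 0;
}

inline std::uint64_t LocalMemUsage() {
  std::size_t bytes = 0;
  FakeQuery(0, nullptr, &bytes);  // first call: how many bytes are needed

  // 8 bytes on every platform. A size_t here would be only 4 bytes in a
  // 32-bit build, and the second call would write past its end.
  std::uint64_t result = 0;
  if (bytes > sizeof(result)) { return 0; }  // refuse a reply that cannot fit

  FakeQuery(bytes, &result, nullptr);  // second call: stays within result
  return result;
}
```

Callers then receive a 64-bit value and have to narrow it explicitly if they need a `size_t`, which matches the note about adjusting all callers.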
2016-07-02  Fixed some memory leaks related to events not properly cleaned-up  Cedric Nugteren
2016-06-29  Updated to version 6.0 of the CLCudaAPI header  Cedric Nugteren
2016-06-18  Moved all headers into the source tree, changed headers to .hpp extension  Cedric Nugteren