Age | Commit message | Author |
|
|
|
API of CLBlast
|
|
|
|
buffer sizes
|
|
Conflicts:
scripts/generator/generator.py
scripts/generator/generator/routine.py
|
|
clashes with other projects
|
|
OpenCL functions
|
|
Since we now use C++ exceptions inside the implementation (and exceptions
can be thrown from constructors), there is no need for a separate
Routine::SetUp() function.
For this, we also change the way the kernel source string is constructed.
The kernel-specific source code is now passed to the Routine ctor via
an initializer_list of C strings to avoid unnecessary data copying
while also working around C1091 of MSVC 2013.
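
The constructor-based setup described above can be sketched as follows. This is a minimal illustration, not the actual Routine class: the name RoutineSketch, its parameters, and the choice of std::invalid_argument are assumptions made for the example. It shows a constructor that takes the kernel source as an initializer_list of C strings, joins the pieces once, and reports failure by throwing, so no separate SetUp() step is required.

```cpp
#include <initializer_list>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a routine class; names are illustrative only.
class RoutineSketch {
 public:
  // The source pieces are passed as plain C strings, so no std::string
  // temporaries are created before they are joined here.
  RoutineSketch(const std::string& name,
                std::initializer_list<const char*> sources) {
    if (sources.size() == 0) {
      // A constructor can signal failure by throwing, which removes the
      // need for a two-phase "construct, then SetUp()" protocol.
      throw std::invalid_argument(name + ": no kernel source given");
    }
    for (const char* piece : sources) {
      source_ += piece;  // concatenate the pieces into one source string
    }
  }

  const std::string& source() const { return source_; }

 private:
  std::string source_;
};
```

Passing several shorter literals instead of one very long one is presumably what sidesteps the MSVC string-length limit mentioned above, since each piece stays a separate literal until it is joined at run time.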
|
|
Since the codebase is designed around proper C++ idioms such as RAII, it
makes sense to only use C++ exceptions internally instead of mixing
exceptions and error codes. The exceptions are now caught at top level
to preserve compatibility with the existing error code-based API.
Note that we deliberately do not catch C++ runtime errors (such as
`std::bad_alloc`) nor logic errors (aka failed assertions) because no
actual handling can ever happen for such errors.
However, in the C interface we do catch _all_ exceptions (...) and
convert them into a wild-card error code.
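
A minimal sketch of the layering described above, under stated assumptions: the StatusCode values, the LibraryError class, and the Scale/ScaleC functions are hypothetical names invented for this example and are not the library's real API. The C++ entry point converts only library-specific exceptions into error codes and lets errors such as std::bad_alloc propagate, while the C entry point catches everything and returns a wild-card code.

```cpp
#include <new>
#include <stdexcept>

// Hypothetical status codes; not the library's real enum values.
enum class StatusCode { kSuccess = 0, kInvalidValue = -1, kUnexpectedError = -2 };

// Hypothetical library-specific exception that carries a status code.
class LibraryError : public std::runtime_error {
 public:
  explicit LibraryError(StatusCode status)
      : std::runtime_error("library error"), status_(status) {}
  StatusCode status() const { return status_; }

 private:
  StatusCode status_;
};

// Internal implementation: reports failures only by throwing.
void ScaleImpl(int n) {
  if (n < 0) { throw LibraryError(StatusCode::kInvalidValue); }
  if (n > 1000000) { throw std::bad_alloc(); }  // simulated allocation failure
}

// C++ API: library errors become codes; anything else propagates to the caller.
StatusCode Scale(int n) {
  try {
    ScaleImpl(n);
    return StatusCode::kSuccess;
  } catch (const LibraryError& e) {
    return e.status();
  }
}

// C API: no exception may cross the language boundary, so catch everything.
extern "C" int ScaleC(int n) {
  try {
    return static_cast<int>(Scale(n));
  } catch (...) {
    return static_cast<int>(StatusCode::kUnexpectedError);
  }
}
```

With these definitions, Scale(-1) returns the invalid-value code, Scale(2000000) lets std::bad_alloc escape, and ScaleC(2000000) maps that same failure to the wild-card code.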
|
|
|
|
variable in a namespace and its container uses const-pointers to the actual data
|
|
now automatically taken from 32-bit if there are no entries at all
|
|
and convenient plain JSON/dict data-type
|
|
M370X GPU
|
|
best-performing cases for a specific parameter combination
|
|
|
|
|
|
handle duplicate entries of different runs
|
|
method as for known device groups
|
|
|
|
tuning results common for a device/vendor type
|
|
generation
|
|
|
|
|
|
now complies with the PEP8 style, added proper command-line argument parsing, and cleaned-up
|
|
|
|
|
|
__declspec(dllimport) when not building the library
|
|
and/or transposing
|
|
routines
|
|
|
|
it to work under CTest properly
|
|
single-precision
|
|
single-precision
|
|
HGEMV/HGBMV/HHEMV/HHBMV/HHPMV/HSYMV/HSBMV/HSPMV/HTRMV/HTBMV/HTPMV
|
|
HSWAP/HSCAL/HCOPY/HAXPY/HDOT/HNRM2/HASUM/HSUM/iHAMAX/iHMAX/iHMIN
|
|
|