Age | Commit message | Author |
|
|
|
Eliminate a temporary Program object
|
|
This was causing a crash for me because the temporary Program's destructor
called clReleaseProgram on its cl_program, and clBuildProgram was then called
on that same cl_program (now held by the Program owned by the shared_ptr, but
it's the same cl_program).
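The failure mode described in this commit message can be sketched without any OpenCL dependency. Everything here is an assumption for illustration: `FakeHandle` stands in for `cl_program`, the `Program` class below is a hypothetical wrapper rather than the library's actual class, and `Build()` only mimics `clBuildProgram` by checking whether the handle was already released.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for cl_program: counts releases instead of calling OpenCL.
struct FakeHandle {
  int release_count = 0;
  bool released() const { return release_count > 0; }
};

// Hypothetical RAII wrapper whose destructor releases the handle,
// mimicking a class that calls clReleaseProgram on destruction.
class Program {
 public:
  explicit Program(FakeHandle* handle) : handle_(handle) {}
  // Copying shares the raw handle with no reference counting (the hazard).
  Program(const Program&) = default;
  ~Program() { if (handle_) ++handle_->release_count; }
  // Stand-in for clBuildProgram: succeeds only while the handle is still live.
  bool Build() const { return handle_ != nullptr && !handle_->released(); }
 private:
  FakeHandle* handle_;
};

// Problematic pattern: a temporary Program is copied into the shared_ptr, and
// the temporary's destructor releases the handle before Build() runs.
bool BuildViaTemporary(FakeHandle* handle) {
  auto program = std::make_shared<Program>(Program(handle));
  return program->Build();
}

// Pattern without the temporary: the Program is constructed in place.
bool BuildDirectly(FakeHandle* handle) {
  auto program = std::make_shared<Program>(handle);
  return program->Build();
}

// Usage sketch: the temporary path fails to build, the direct path succeeds.
void Demo() {
  FakeHandle a, b;
  assert(!BuildViaTemporary(&a));
  assert(BuildDirectly(&b));
}
```

In this sketch the temporary path also releases the handle twice (once per `Program` copy), which is the kind of double release that would be undefined behaviour with a real driver.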
|
|
Disabled calls to clReleaseProgram under Windows
|
|
OpenCL driver unloads first
|
|
comparisons
|
|
barriers are present
|
|
TRSV global worksize issue
|
|
test from README
|
|
Apple OpenCL limitations for TRSV/TRSM now return not-implemented status
|
|
Runtime statistics in client
|
|
< 16 LWGS for TRSV and TRSM
|
|
size
|
|
|
|
and standard-deviation
|
|
Added an option to run the routine tuner for a single specific GEMM size
|
|
|
|
Routine tuners read kernel JSON from disk
|
|
|
|
available; now run part of alltuners target
|
|
|
|
Canary buffer overflow protection
|
|
|
|
Better cache behaviour of OpenCL programs
|
|
Update CI links to use domain names and build names instead of IP/id
|
|
Updates the README badges to point to the domain name instead of
IP addresses. Also labels each build by its build name instead of
its build id.
|
|
Intel subgroup shuffling
|
|
the OpenCL program
|
|
Added 2D-register-caching GEMM kernel
|
|
with the new kernel
|