Age | Commit message (Collapse) | Author |
|
where the AMD compiler crashes"
This reverts commit 407ed52cec41445f02e85cb45d08f590960216bb.
|
AMD compiler crashes
|
names in kernels
|
loops in kernel accordingly
|
pragma for several kernels
|
changed transpose kernel code
|
optional temporary buffers
|
NVIDIA and ARM GPUs
|
sets of input parameters
|
GEMM kernel
|
TRSM
|
specific tuning parameters (2)
|
specific tuning parameters
|
kernel
|
for the GEMM direct kernels
|
incomplete rectangles
|
transposing/non-transposing: NN, NT, TN, TT
|
to 256-256-256
|
NWGD and KWGD into one WGD parameter
|
indirect version
|
can't handle long strings
|
problems if C contains NaNs
|
dvasschemacq-master
Conflicts:
	src/kernels/level1/xaxpy.opencl
	src/kernels/level2/xgemv.opencl
	src/kernels/level2/xgemv_fast.opencl
	src/kernels/level2/xger.opencl
	src/kernels/level2/xher.opencl
	src/kernels/level2/xher2.opencl
	src/kernels/level3/xgemm_part2.opencl
|