Age | Commit message | Author |
|
loops in kernel accordingly
|
|
changed transpose kernel code
|
|
NVIDIA and ARM GPUs
|
|
sets of input parameters
|
|
kernel
|
|
for the GEMM direct kernels
|
|
incomplete rectangles
|
|
transposing/non-transposing: NN, NT, TN, TT
|
|
to 256-256-256
|