Age | Commit message | Author |
|
Replace the looped test by a single one with the offset of the last batch.
|
|
Replace the looped test by a single one with the maximal found offset.
|
|
Convolution with single kernel
|
|
strided-batched-GEMM routine
|
|
kernel and test
|
|
transposing
|
|
barriers are present
|
|
TRSV global worksize issue
|
|
< 16 LWGS for TRSV and TRSM
|
|
size
|
|
approach for convgemm
|
|
local memory support now
|
|
edge cases now
|
|
gemm kernel
|
|
a new kernel
|
|
with the new kernel
|
|
GEMM routine
|
|