Age | Commit message | Author |
|
executable and without re-running CMake
|
incomplete rectangles
|
transposing/non-transposing: NN, NT, TN, TT
|
|
to 256-256-256
|
default for the GEMM direct kernel
|
NWGD and KWGD into one WGD parameter
|
|
can't handle long strings
|
|
explored exhaustively and a larger set which is explored randomly
|
|
search space to have a better chance to evaluate more likely parameter combinations
|
enabling better memory performance
|
|
case of fp16 arguments are cast on host and in kernel
|