From 91dbd580ab2f5d2363d51ba4e3fc9735f1c7a937 Mon Sep 17 00:00:00 2001
From: Cedric Nugteren
Date: Sat, 15 Sep 2018 18:43:51 +0200
Subject: Added a kernel-parameter pair table to document the tuning API

---
 doc/api.md    |  4 ++--
 doc/tuning.md | 20 ++++++++++++++++++++
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/doc/api.md b/doc/api.md
index a60e16ce..7bd2abf2 100644
--- a/doc/api.md
+++ b/doc/api.md
@@ -3452,7 +3452,7 @@ Arguments to FillCache:
 RetrieveParameters: Retrieves current tuning parameters (auxiliary function)
 -------------
 
-This function retrieves current tuning parameters for a specific device-precision-kernel combination. This can be used for debugging or inspection.
+This function retrieves current tuning parameters for a specific device-precision-kernel combination. This can be used for debugging or inspection. See [tuning.md](tuning.md) for more details on which kernel names and parameters are valid.
 
 C++ API:
 ```
@@ -3475,7 +3475,7 @@ Arguments to RetrieveParameters (C++ version):
 OverrideParameters: Override tuning parameters (auxiliary function)
 -------------
 
-This function overrides tuning parameters for a specific device-precision-kernel combination. The next time the target routine is called it will be re-compiled and use the new parameters. All further times (until `OverrideParameters` is called again) it will load the kernel from the cache and thus continue to use the new parameters. Note that the first time after calling `OverrideParameters` a performance drop can be observable due to the re-compilation of the kernel.
+This function overrides tuning parameters for a specific device-precision-kernel combination. The next time the target routine is called it will be re-compiled and use the new parameters. All further times (until `OverrideParameters` is called again) it will load the kernel from the cache and thus continue to use the new parameters. Note that the first time after calling `OverrideParameters` a performance drop can be observable due to the re-compilation of the kernel. See [tuning.md](tuning.md) for more details on which kernel names and parameters are valid.
 
 C++ API:
 ```
diff --git a/doc/tuning.md b/doc/tuning.md
index 938c3b6a..3117ffad 100644
--- a/doc/tuning.md
+++ b/doc/tuning.md
@@ -195,6 +195,26 @@ To inspect current behaviour, you can also retrieve the parameters for a specifi
                                   const Precision precision,
                                   std::unordered_map<std::string,size_t> &parameters)
 
+These two functions require/retrieve the parameters as given in [src/database/kernels](../src/database/kernels), i.e.:
+
+| Kernel name | Parameters |
+| --------------------|-----------------------|
+| Xaxpy | VW, WGS, WPT |
+| Xdot | WGS1, WGS2 |
+| Xgemv | WGS1, WPT1, UNROLL1 |
+| XgemvFast | VW2, WGS2, WPT2 |
+| XgemvFastRot | VW3, WGS3, WPT3 |
+| Xger | WGS1, WGS2, WPT |
+| Xtrsv | TRSV_BLOCK_SIZE |
+| Xgemm | GEMMK, KREG, KWG, KWI, MDIMA, MDIMC, MWG, NDIMB, NDIMC, NWG, SA, SB, STRM, STRN, VWM, VWN |
+| XgemmDirect | KWID, MDIMAD, MDIMCD, NDIMBD, NDIMCD, PADA, PADB, VWMD, VWND, WGD |
+| Copy | COPY_DIMX, COPY_DIMY, COPY_VW, COPY_WPT |
+| Pad | PAD_DIMX, PAD_DIMY, PAD_WPTX, PAD_WPTY |
+| Transpose | TRA_DIM, TRA_PAD, TRA_SHUFFLE, TRA_WPT |
+| Padtranspose | PADTRA_PAD, PADTRA_TILE, PADTRA_WPT |
+| Invert | INTERNAL_BLOCK_SIZE |
+| TrsvRoutine | TRSV_BLOCK_SIZE |
+
 Tuning OpenCL compiler options
 -------------
 
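
Reviewer note: the table added to doc/tuning.md lends itself to a quick sanity check on a parameter map before it is handed to `OverrideParameters`. The following is only a sketch under stated assumptions: `kKnownParameters` and `UnknownParameters` are hypothetical names that are not part of CLBlast, the lookup table copies just four rows of the table for brevity, and the code deliberately does not call the library itself.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical lookup table (not part of CLBlast): a subset of the
// kernel/parameter pairs from the table added to doc/tuning.md.
static const std::unordered_map<std::string, std::unordered_set<std::string>>
    kKnownParameters = {
        {"Xaxpy", {"VW", "WGS", "WPT"}},
        {"Xdot", {"WGS1", "WGS2"}},
        {"Copy", {"COPY_DIMX", "COPY_DIMY", "COPY_VW", "COPY_WPT"}},
        {"Transpose", {"TRA_DIM", "TRA_PAD", "TRA_SHUFFLE", "TRA_WPT"}},
};

// Returns the names in `parameters` that the lookup table does not list for
// `kernel_name`. If the kernel name itself is not in the lookup table, every
// supplied parameter name is reported.
std::vector<std::string> UnknownParameters(
    const std::string &kernel_name,
    const std::unordered_map<std::string, std::size_t> &parameters) {
  std::vector<std::string> unknown;
  const auto kernel = kKnownParameters.find(kernel_name);
  for (const auto &entry : parameters) {
    if (kernel == kKnownParameters.end() ||
        kernel->second.count(entry.first) == 0) {
      unknown.push_back(entry.first);
    }
  }
  return unknown;
}
```

A caller could build its `std::unordered_map<std::string,size_t>` as usual, run it through `UnknownParameters("Xaxpy", parameters)`, and only proceed to `OverrideParameters` when the returned vector is empty. The check covers misspelled or misplaced names only; it does not verify that every parameter in a row is present or that the values are sensible for the device.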