author: Cedric Nugteren <web@cedricnugteren.nl> 2018-09-16 20:01:18 +0200
committer: Cedric Nugteren <web@cedricnugteren.nl> 2018-09-16 20:01:18 +0200
commit: 83ba3d4b7ba3a9cb5fbd2c1ad2bb14b2addd39fb
tree: 58900a63158d08e76342b46372fcc59015b4d3ca /doc
parent: b7d833901213d03fe5e7f10c15741f55c6c1eb54
parent: c163868e1822a97750b4380f0d9cdd38369f9f0b
Merge branch 'master' into convgemm_multi_kernel
Diffstat (limited to 'doc')
-rw-r--r-- doc/api.md 4
-rw-r--r-- doc/bindings.md 6
-rw-r--r-- doc/tuning.md 20
3 files changed, 28 insertions, 2 deletions
diff --git a/doc/api.md b/doc/api.md
index 02bca018..15bc0dcd 100644
--- a/doc/api.md
+++ b/doc/api.md
@@ -3512,7 +3512,7 @@ Arguments to FillCache:
RetrieveParameters: Retrieves current tuning parameters (auxiliary function)
-------------
-This function retrieves current tuning parameters for a specific device-precision-kernel combination. This can be used for debugging or inspection.
+This function retrieves current tuning parameters for a specific device-precision-kernel combination. This can be used for debugging or inspection. See [tuning.md](tuning.md) for more details on which kernel names and parameters are valid.
C++ API:
```
@@ -3535,7 +3535,7 @@ Arguments to RetrieveParameters (C++ version):
OverrideParameters: Override tuning parameters (auxiliary function)
-------------
-This function overrides tuning parameters for a specific device-precision-kernel combination. The next time the target routine is called it will be re-compiled and use the new parameters. All further times (until `OverrideParameters` is called again) it will load the kernel from the cache and thus continue to use the new parameters. Note that the first time after calling `OverrideParameters` a performance drop can be observable due to the re-compilation of the kernel.
+This function overrides tuning parameters for a specific device-precision-kernel combination. The next time the target routine is called it will be re-compiled and use the new parameters. All further times (until `OverrideParameters` is called again) it will load the kernel from the cache and thus continue to use the new parameters. Note that the first time after calling `OverrideParameters` a performance drop can be observable due to the re-compilation of the kernel. See [tuning.md](tuning.md) for more details on which kernel names and parameters are valid.
C++ API:
```
diff --git a/doc/bindings.md b/doc/bindings.md
index 3bd3fc7b..85508e68 100644
--- a/doc/bindings.md
+++ b/doc/bindings.md
@@ -30,3 +30,9 @@ Nim: nim-CLBlast (3rd party)
-------------
A 3rd party CLBlast wrapper for the nim language is available [here](https://github.com/numforge/nim-clblast).
+
+
+Julia: CLBlast.jl (3rd party)
+-------------
+
+A 3rd party CLBlast wrapper for [Julia](https://julialang.org/) is available [here](https://github.com/JuliaGPU/CLBlast.jl).
diff --git a/doc/tuning.md b/doc/tuning.md
index 938c3b6a..3117ffad 100644
--- a/doc/tuning.md
+++ b/doc/tuning.md
@@ -195,6 +195,26 @@ To inspect current behaviour, you can also retrieve the parameters for a specifi
const Precision precision,
std::unordered_map<std::string,size_t> &parameters)
+These two functions require/retrieve the parameters as given in [src/database/kernels](../src/database/kernels), i.e.:
+
+| Kernel name | Parameters |
+| --------------------|-----------------------|
+| Xaxpy | VW, WGS, WPT |
+| Xdot | WGS1, WGS2 |
+| Xgemv | WGS1, WPT1, UNROLL1 |
+| XgemvFast | VW2, WGS2, WPT2 |
+| XgemvFastRot | VW3, WGS3, WPT3 |
+| Xger | WGS1, WGS2, WPT |
+| Xtrsv | TRSV_BLOCK_SIZE |
+| Xgemm | GEMMK, KREG, KWG, KWI, MDIMA, MDIMC, MWG, NDIMB, NDIMC, NWG, SA, SB, STRM, STRN, VWM, VWN |
+| XgemmDirect | KWID, MDIMAD, MDIMCD, NDIMBD, NDIMCD, PADA, PADB, VWMD, VWND, WGD |
+| Copy | COPY_DIMX, COPY_DIMY, COPY_VW, COPY_WPT |
+| Pad | PAD_DIMX, PAD_DIMY, PAD_WPTX, PAD_WPTY |
+| Transpose | TRA_DIM, TRA_PAD, TRA_SHUFFLE, TRA_WPT |
+| Padtranspose | PADTRA_PAD, PADTRA_TILE, PADTRA_WPT |
+| Invert | INTERNAL_BLOCK_SIZE |
+| TrsvRoutine | TRSV_BLOCK_SIZE |
+
Tuning OpenCL compiler options
-------------
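
The kernel/parameter table added to `doc/tuning.md` above can be turned into a small pre-flight check that runs before a parameter map is handed to `OverrideParameters`. The following is a minimal sketch, not part of CLBlast: the names `kValidParameters` and `CheckParameters` are hypothetical, only a few rows of the table are copied in, and the sketch assumes that a complete set of parameters with exactly the listed names is expected for a kernel. It does not check whether the values themselves are legal for a given device.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical lookup table (not a CLBlast API): parameter names per kernel,
// copied from a few rows of the table in doc/tuning.md.
static const std::unordered_map<std::string, std::set<std::string>> kValidParameters = {
  {"Xaxpy", {"VW", "WGS", "WPT"}},
  {"Xdot", {"WGS1", "WGS2"}},
  {"Copy", {"COPY_DIMX", "COPY_DIMY", "COPY_VW", "COPY_WPT"}},
  {"Invert", {"INTERNAL_BLOCK_SIZE"}},
};

// Hypothetical helper: compares the names in 'parameters' against the table.
// Returns one message per problem; an empty result means the names match.
std::vector<std::string> CheckParameters(
    const std::string &kernel_name,
    const std::unordered_map<std::string, std::size_t> &parameters) {
  std::vector<std::string> problems;
  const auto entry = kValidParameters.find(kernel_name);
  if (entry == kValidParameters.end()) {
    problems.push_back("unknown kernel: " + kernel_name);
    return problems;
  }
  // Assumption: every parameter listed for the kernel has to be supplied.
  for (const auto &name : entry->second) {
    if (parameters.count(name) == 0) {
      problems.push_back("missing parameter: " + name);
    }
  }
  // Names that are not listed for this kernel are reported as well.
  for (const auto &parameter : parameters) {
    if (entry->second.count(parameter.first) == 0) {
      problems.push_back("unexpected parameter: " + parameter.first);
    }
  }
  return problems;
}
```

If `CheckParameters` returns an empty vector, the same map can then be passed on to `OverrideParameters` together with the device and precision. Keeping the table as plain data makes it easy to extend with the remaining rows.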