author | Cedric Nugteren <web@cedricnugteren.nl> | 2018-09-16 20:01:18 +0200 |
---|---|---|
committer | Cedric Nugteren <web@cedricnugteren.nl> | 2018-09-16 20:01:18 +0200 |
commit | 83ba3d4b7ba3a9cb5fbd2c1ad2bb14b2addd39fb (patch) | |
tree | 58900a63158d08e76342b46372fcc59015b4d3ca /doc | |
parent | b7d833901213d03fe5e7f10c15741f55c6c1eb54 (diff) | |
parent | c163868e1822a97750b4380f0d9cdd38369f9f0b (diff) |
Merge branch 'master' into convgemm_multi_kernel
Diffstat (limited to 'doc')
-rw-r--r-- | doc/api.md | 4 |
-rw-r--r-- | doc/bindings.md | 6 |
-rw-r--r-- | doc/tuning.md | 20 |
3 files changed, 28 insertions, 2 deletions
@@ -3512,7 +3512,7 @@ Arguments to FillCache:
 RetrieveParameters: Retrieves current tuning parameters (auxiliary function)
 -------------
 
-This function retrieves current tuning parameters for a specific device-precision-kernel combination. This can be used for debugging or inspection.
+This function retrieves current tuning parameters for a specific device-precision-kernel combination. This can be used for debugging or inspection. See [tuning.md](tuning.md) for more details on which kernel names and parameters are valid.
 
 C++ API:
 ```
@@ -3535,7 +3535,7 @@ Arguments to RetrieveParameters (C++ version):
 OverrideParameters: Override tuning parameters (auxiliary function)
 -------------
 
-This function overrides tuning parameters for a specific device-precision-kernel combination. The next time the target routine is called it will be re-compiled and use the new parameters. All further times (until `OverrideParameters` is called again) it will load the kernel from the cache and thus continue to use the new parameters. Note that the first time after calling `OverrideParameters` a performance drop can be observable due to the re-compilation of the kernel.
+This function overrides tuning parameters for a specific device-precision-kernel combination. The next time the target routine is called it will be re-compiled and use the new parameters. All further times (until `OverrideParameters` is called again) it will load the kernel from the cache and thus continue to use the new parameters. Note that the first time after calling `OverrideParameters` a performance drop can be observable due to the re-compilation of the kernel. See [tuning.md](tuning.md) for more details on which kernel names and parameters are valid.
 
 C++ API:
 ```
diff --git a/doc/bindings.md b/doc/bindings.md
index 3bd3fc7b..85508e68 100644
--- a/doc/bindings.md
+++ b/doc/bindings.md
@@ -30,3 +30,9 @@ Nim: nim-CLBlast (3rd party)
 -------------
 
 A 3rd party CLBlast wrapper for the nim language is available [here](https://github.com/numforge/nim-clblast).
+
+
+Julia: CLBlast.jl (3rd party)
+-------------
+
+A 3rd party CLBlast wrapper for [Julia](https://julialang.org/) is available [here](https://github.com/JuliaGPU/CLBlast.jl).
diff --git a/doc/tuning.md b/doc/tuning.md
index 938c3b6a..3117ffad 100644
--- a/doc/tuning.md
+++ b/doc/tuning.md
@@ -195,6 +195,26 @@ To inspect current behaviour, you can also retrieve the parameters for a specifi
     const Precision precision,
     std::unordered_map<std::string,size_t> &parameters)
 
+These two functions require/retrieve the parameters as given in [src/database/kernels](../src/database/kernels), i.e.:
+
+| Kernel name | Parameters |
+| --------------------|-----------------------|
+| Xaxpy | VW, WGS, WPT |
+| Xdot | WGS1, WGS2 |
+| Xgemv | WGS1, WPT1, UNROLL1 |
+| XgemvFast | VW2, WGS2, WPT2 |
+| XgemvFastRot | VW3, WGS3, WPT3 |
+| Xger | WGS1, WGS2, WPT |
+| Xtrsv | TRSV_BLOCK_SIZE |
+| Xgemm | GEMMK, KREG, KWG, KWI, MDIMA, MDIMC, MWG, NDIMB, NDIMC, NWG, SA, SB, STRM, STRN, VWM, VWN |
+| XgemmDirect | KWID, MDIMAD, MDIMCD, NDIMBD, NDIMCD, PADA, PADB, VWMD, VWND, WGD |
+| Copy | COPY_DIMX, COPY_DIMY, COPY_VW, COPY_WPT |
+| Pad | PAD_DIMX, PAD_DIMY, PAD_WPTX, PAD_WPTY |
+| Transpose | TRA_DIM, TRA_PAD, TRA_SHUFFLE, TRA_WPT |
+| Padtranspose | PADTRA_PAD, PADTRA_TILE, PADTRA_WPT |
+| Invert | INTERNAL_BLOCK_SIZE |
+| TrsvRoutine | TRSV_BLOCK_SIZE |
+
 Tuning OpenCL compiler options
 -------------
 
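
The kernel/parameter table added to doc/tuning.md can be used to sanity-check a parameter map before it is passed to `OverrideParameters`. The following is a minimal sketch only: `ExpectedParameters` and `CheckParameters` are hypothetical helpers and not part of CLBlast, just three rows of the table are copied in, and it assumes that the complete parameter set of a kernel is expected.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <unordered_map>
#include <vector>

// Expected parameter names per kernel, copied from the table added to
// doc/tuning.md. Only three kernels are listed to keep the sketch short.
inline const std::map<std::string, std::set<std::string>> &ExpectedParameters() {
  static const std::map<std::string, std::set<std::string>> table = {
      {"Xaxpy", {"VW", "WGS", "WPT"}},
      {"Xdot", {"WGS1", "WGS2"}},
      {"Copy", {"COPY_DIMX", "COPY_DIMY", "COPY_VW", "COPY_WPT"}},
  };
  return table;
}

// Hypothetical helper (not a CLBlast function): returns a list of problems
// found in 'parameters' for the given kernel; an empty list means the map
// names exactly the parameters listed for that kernel.
inline std::vector<std::string> CheckParameters(
    const std::string &kernel_name,
    const std::unordered_map<std::string, std::size_t> &parameters) {
  std::vector<std::string> problems;
  const auto &table = ExpectedParameters();
  const auto entry = table.find(kernel_name);
  if (entry == table.end()) {
    problems.push_back("unknown kernel: " + kernel_name);
    return problems;
  }
  // Every listed parameter must be present (assumption: full set expected).
  for (const auto &name : entry->second) {
    if (parameters.count(name) == 0) {
      problems.push_back("missing parameter: " + name);
    }
  }
  // No parameter outside the listed set may be present.
  for (const auto &kv : parameters) {
    if (entry->second.count(kv.first) == 0) {
      problems.push_back("unexpected parameter: " + kv.first);
    }
  }
  return problems;
}
```

For example, `CheckParameters("Xaxpy", {{"VW", 2}, {"WGS", 64}, {"WPT", 1}})` returns an empty list, so that map would name the expected Xaxpy parameters; the values themselves (2, 64, 1) are illustrative and are not checked for validity on any device.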