author Cedric Nugteren <web@cedricnugteren.nl> 2017-10-16 21:54:23 +0200
committer Cedric Nugteren <web@cedricnugteren.nl> 2017-10-16 21:54:42 +0200
commit 03760f80eb7eb07450da379d129ba64d92bfcc41
tree 81b1466c86e9bbb3c4dc52f223b21d21c55d6092 /README.md
parent 0719f1448655192d2ce6c17ee51c770ef16dd120
Added CUDA API documentation
Diffstat (limited to 'README.md')
-rw-r--r-- README.md 14
1 files changed, 13 insertions, 1 deletions
diff --git a/README.md b/README.md
index c13770f6..dac47fce 100644
--- a/README.md
+++ b/README.md
@@ -99,11 +99,23 @@ To get started quickly, a couple of stand-alone example programs are included in
cmake -DSAMPLES=ON ..
+For all of CLBlast's APIs, it is possible to optionally set an OS environmental variable `CLBLAST_BUILD_OPTIONS` to pass specific build options to the OpenCL compiler.
+
+
+Using the library (Netlib API)
+-------------
+
There is also a Netlib CBLAS C API available. This is however not recommended for full control over performance, since at every call it will copy all buffers to and from the OpenCL device. Especially for level 1 and level 2 BLAS functions performance will be impacted severely. However, it can be useful if you don't want to touch OpenCL at all. You can set the default device and platform by setting the `CLBLAST_DEVICE` and `CLBLAST_PLATFORM` environmental variables. This API can be used as follows after providing the `-DNETLIB=ON` flag to CMake:
#include <clblast_netlib_c.h>
-For all of CLBlast's APIs, it is possible to optionally set an OS environmental variable `CLBLAST_BUILD_OPTIONS` to pass specific build options to the OpenCL compiler.
+
+Using the library (CUDA API)
+-------------
+
+There is also a CUDA API of CLBlast available. Enabling this compiles the whole library for CUDA and thus replaces the OpenCL API. It is based upon the CUDA runtime and NVRTC APIs, requiring NVIDIA CUDA 7.5 or higher. The CUDA version of the library can be used as follows after providing the `-DCUDA=ON -DOPENCL=OFF` flags to CMake:
+
+ #include <clblast_cuda.h>
Using the tuners (optional)