author Cedric Nugteren <web@cedricnugteren.nl> 2017-10-16 21:54:23 +0200
committer Cedric Nugteren <web@cedricnugteren.nl> 2017-10-16 21:54:42 +0200
commit 03760f80eb7eb07450da379d129ba64d92bfcc41
tree 81b1466c86e9bbb3c4dc52f223b21d21c55d6092 /CHANGELOG
parent 0719f1448655192d2ce6c17ee51c770ef16dd120
Added CUDA API documentation
Diffstat (limited to 'CHANGELOG')
-rw-r--r-- CHANGELOG 4
1 file changed, 4 insertions, 0 deletions
diff --git a/CHANGELOG b/CHANGELOG
index bb2013a6..a2416dd3 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -1,5 +1,9 @@
Development (next version)
+- Added a CUDA API to CLBlast:
+ * The library and kernels can be compiled with the CUDA driver API and NVRTC (requires CUDA 7.5)
+ * Two CUDA API sample programs are added: SGEMM and DAXPY
+ * All correctness tests and performance clients work on CUDA like they did for OpenCL
- Kernels are now cached based on their tuning parameters: fits the use-case of 'OverrideParameters'
- Improved performance for small GEMM problems by going from 3 to 1 optional temporary buffers
- Various minor fixes and enhancements