author Cedric Nugteren <web@cedricnugteren.nl> 2017-11-02 21:47:14 +0100
committer Cedric Nugteren <web@cedricnugteren.nl> 2017-11-02 21:47:14 +0100
commit 9b0a435fb00b845b875590be90acffcd4f3bb009
tree 754b523789ef717619b540925c97e7167ba28f06 /CHANGELOG
parent 73272ab97dbd5abe757f6558c9b89665c5ac99d0
Integrated the GEMM routine tuner for kernel selection; added first tuning results
Diffstat (limited to 'CHANGELOG')
-rw-r--r-- CHANGELOG 1
1 files changed, 1 insertions, 0 deletions
diff --git a/CHANGELOG b/CHANGELOG
index 14a6dd22..c565559f 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -8,6 +8,7 @@ Development (next version)
* All correctness tests and performance clients work on CUDA like they did for OpenCL
- Kernels are now cached based on their tuning parameters: fits the use-case of 'OverrideParameters'
- Improved performance for small GEMM problems by going from 3 to 1 optional temporary buffers
+- GEMM kernel selection (direct vs in-direct) is now done automatically using a new tuner
- Various minor fixes and enhancements
- Added tuned parameters for various devices (see README)