From 9b0a435fb00b845b875590be90acffcd4f3bb009 Mon Sep 17 00:00:00 2001
From: Cedric Nugteren
Date: Thu, 2 Nov 2017 21:47:14 +0100
Subject: Integrated the GEMM routine tuner for kernel selection; added first tuning results

---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

(limited to 'CHANGELOG')

diff --git a/CHANGELOG b/CHANGELOG
index 14a6dd22..c565559f 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -8,6 +8,7 @@ Development (next version)
   * All correctness tests and performance clients work on CUDA like they did for OpenCL
 - Kernels are now cached based on their tuning parameters: fits the use-case of 'OverrideParameters'
 - Improved performance for small GEMM problems by going from 3 to 1 optional temporary buffers
+- GEMM kernel selection (direct vs in-direct) is now done automatically using a new tuner
 - Various minor fixes and enhancements
 - Added tuned parameters for various devices (see README)
 
-- 
cgit v1.2.3