author Cedric Nugteren <web@cedricnugteren.nl> 2018-04-29 15:48:35 +0200
committer GitHub <noreply@github.com> 2018-04-29 15:48:35 +0200
commit b2248a17ae24ba72618d80b98196221049cc3933
tree eb016ea1987926433ccf0ccfebabf3ee972200d8
parent 7b416c8686f7cc83c79c886e24851db33baacf80
parent 9f22bc232ba30b55b7c7fece21e0906720f079a4
Merge pull request #277 from CNugteren/CLBlast-257-intel-subgroups
Intel subgroup shuffling
Diffstat (limited to 'CHANGELOG')
-rw-r--r-- CHANGELOG 1
1 files changed, 1 insertions, 0 deletions
diff --git a/CHANGELOG b/CHANGELOG
index 621fa9b9..5f3ef371 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -4,6 +4,7 @@ Development (next version)
 - Added CLBlast to Ubuntu PPA and macOS Homebrew package managers
 - Added an API to run the tuners programmatically without any I/O
 - Improved the performance potential by adding a second tunable GEMM kernel with 2D register tiling
+- Added support for Intel specific subgroup shuffling extensions for faster GEMM on Intel GPUs
 - Re-added a local memory size constraint to the tuners
 - Updated and reorganised the CLBlast documentation
 - Fixed an access violation when compiled with Visual Studio upon releasing the OpenCL program