author | Cedric Nugteren <web@cedricnugteren.nl> | 2018-04-29 15:48:35 +0200 |
---|---|---|
committer | GitHub <noreply@github.com> | 2018-04-29 15:48:35 +0200 |
commit | b2248a17ae24ba72618d80b98196221049cc3933 (patch) | |
tree | eb016ea1987926433ccf0ccfebabf3ee972200d8 /CHANGELOG | |
parent | 7b416c8686f7cc83c79c886e24851db33baacf80 (diff) | |
parent | 9f22bc232ba30b55b7c7fece21e0906720f079a4 (diff) |
Merge pull request #277 from CNugteren/CLBlast-257-intel-subgroups
Intel subgroup shuffling
Diffstat (limited to 'CHANGELOG')
-rw-r--r-- | CHANGELOG | 1 |
1 files changed, 1 insertions, 0 deletions
@@ -4,6 +4,7 @@ Development (next version)
 - Added CLBlast to Ubuntu PPA and macOS Homebrew package managers
 - Added an API to run the tuners programmatically without any I/O
 - Improved the performance potential by adding a second tunable GEMM kernel with 2D register tiling
+- Added support for Intel specific subgroup shuffling extensions for faster GEMM on Intel GPUs
 - Re-added a local memory size constraint to the tuners
 - Updated and reorganised the CLBlast documentation
 - Fixed an access violation when compiled with Visual Studio upon releasing the OpenCL program