author Cedric Nugteren <web@cedricnugteren.nl> 2018-09-16 20:01:18 +0200
committer Cedric Nugteren <web@cedricnugteren.nl> 2018-09-16 20:01:18 +0200
commit 83ba3d4b7ba3a9cb5fbd2c1ad2bb14b2addd39fb (patch)
tree 58900a63158d08e76342b46372fcc59015b4d3ca /CHANGELOG
parent b7d833901213d03fe5e7f10c15741f55c6c1eb54 (diff)
parent c163868e1822a97750b4380f0d9cdd38369f9f0b (diff)
Merge branch 'master' into convgemm_multi_kernel
Diffstat (limited to 'CHANGELOG')
-rw-r--r-- CHANGELOG 4
1 file changed, 4 insertions, 0 deletions
diff --git a/CHANGELOG b/CHANGELOG
index 53958d6f..63179e95 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -1,7 +1,11 @@
 Development (next version)
 - Added support for shuffle instructions for NVIDIA GPUs (thanks to 'tyler-utah')
+- Added an option to compile the Netlib API with static OpenCL device and context (-DNETLIB_PERSISTENT_OPENCL=ON)
+- The tuners now check beforehand on invalid local thread sizes and skip those completely
+- Fixed an issue with conjugate transpose not being executed in certain cases for a.o. XOMATCOPY
 - Fixed an issue with AMD GPUs and the new GEMMK == 1 kernel
+- Fixed an issue with the preprocessor and the new GEMMK == 1 kernel
 - Various minor fixes and enhancements
 - Added non-BLAS routines:
   * SCONVGEMM/DCONVGEMM/HCONVGEMM (convolution as im2col followed by batched GEMM)