author | Cedric Nugteren <web@cedricnugteren.nl> | 2018-09-16 20:01:18 +0200 |
---|---|---|
committer | Cedric Nugteren <web@cedricnugteren.nl> | 2018-09-16 20:01:18 +0200 |
commit | 83ba3d4b7ba3a9cb5fbd2c1ad2bb14b2addd39fb (patch) | |
tree | 58900a63158d08e76342b46372fcc59015b4d3ca /CHANGELOG | |
parent | b7d833901213d03fe5e7f10c15741f55c6c1eb54 (diff) | |
parent | c163868e1822a97750b4380f0d9cdd38369f9f0b (diff) |
Merge branch 'master' into convgemm_multi_kernel
Diffstat (limited to 'CHANGELOG')
-rw-r--r-- | CHANGELOG | 4 |
1 files changed, 4 insertions, 0 deletions
@@ -1,7 +1,11 @@
 
 Development (next version)
 - Added support for shuffle instructions for NVIDIA GPUs (thanks to 'tyler-utah')
+- Added an option to compile the Netlib API with static OpenCL device and context (-DNETLIB_PERSISTENT_OPENCL=ON)
+- The tuners now check beforehand on invalid local thread sizes and skip those completely
+- Fixed an issue with conjugate transpose not being executed in certain cases for a.o. XOMATCOPY
 - Fixed an issue with AMD GPUs and the new GEMMK == 1 kernel
+- Fixed an issue with the preprocessor and the new GEMMK == 1 kernel
 - Various minor fixes and enhancements
 - Added non-BLAS routines:
   * SCONVGEMM/DCONVGEMM/HCONVGEMM (convolution as im2col followed by batched GEMM)