Diffstat (limited to 'CHANGELOG')
-rw-r--r-- | CHANGELOG | 4 |
1 files changed, 4 insertions, 0 deletions
@@ -1,7 +1,11 @@
 
 Development (next version)
 - Added support for shuffle instructions for NVIDIA GPUs (thanks to 'tyler-utah')
+- Added an option to compile the Netlib API with static OpenCL device and context (-DNETLIB_PERSISTENT_OPENCL=ON)
+- The tuners now check beforehand on invalid local thread sizes and skip those completely
+- Fixed an issue with conjugate transpose not being executed in certain cases for a.o. XOMATCOPY
 - Fixed an issue with AMD GPUs and the new GEMMK == 1 kernel
+- Fixed an issue with the preprocessor and the new GEMMK == 1 kernel
 - Various minor fixes and enhancements
 - Added non-BLAS routines:
   * SCONVGEMM/DCONVGEMM/HCONVGEMM (convolution as im2col followed by batched GEMM)