path: root/src/kernels/level3/copy_pad.opencl
Age | Commit message | Author
2023-01-03 | implemented changes to boost Adreno performance according to https://jira-dc.qualcomm.com/jira/browse/OSR-8731 | Angus, Alexander
2018-01-08 | Implemented the in-direct version of the strided-batched GEMM kernel | Cedric Nugteren
2017-11-29 | Reformatted unrollable kernel loops and added the new promote_to_registers pragma for several kernels | Cedric Nugteren
2017-07-08 | Made the inline keyword in kernels optional currently only enabled for NVIDIA and ARM GPUs | Cedric Nugteren
2017-03-19 | Added batched versions of the pad/copy/transpose kernels | Cedric Nugteren
2016-08-20 | Merge branch 'master' of https://github.com/dvasschemacq/CLBlast into dvasschemacq-master | Cedric Nugteren
Conflicts:
  src/kernels/level1/xaxpy.opencl
  src/kernels/level2/xgemv.opencl
  src/kernels/level2/xgemv_fast.opencl
  src/kernels/level2/xger.opencl
  src/kernels/level2/xher.opencl
  src/kernels/level2/xher2.opencl
  src/kernels/level3/xgemm_part2.opencl
2016-08-18 | Adapt opencl files for 1.1 OpenCL | D. Van Assche
In OpenCL 1.1 __kernel has to be before __attribute__, at least with Vivante compiler.
2016-07-10 | Now passing alpha/beta to the kernel as arguments as before fp16 support; in case of fp16 arguments are cast on host and in kernel | Cedric Nugteren
2016-06-16 | Added XOMATCOPY routines to perform out-of-place matrix scaling, copying, and/or transposing | Cedric Nugteren
2016-06-14 | Re-organised the level-3 supporting kernels (copy, pad, transpose, convert) and renamed files and functions appropriately | Cedric Nugteren