Age | Commit message | Author
2018-11-12 | Add kernel_mode option to im2col, col2im, and convgemm functions | Koichi Akabe
2018-11-09 | Merge pull request #331 from CNugteren/CLBlast-270-col2im | Cedric Nugteren
2018-11-07 | Changed col2im to append to the existing im-buffer | Cedric Nugteren
2018-11-01 | Added new col2im routine to the documentation | Cedric Nugteren
2018-11-01 | Fixed half-precision tests for im2col and col2im | Cedric Nugteren
2018-10-31 | Merge pull request #330 from vbkaisetsu/CLBlast-270-col2im | Cedric Nugteren
2018-10-30 | Fix col2im implementation | Koichi Akabe
2018-10-29 | Merge pull request #329 from tholu/patch-1 | Cedric Nugteren
2018-10-28 | Update FindOpenCL.cmake | Thomas Lutz
2018-10-23 | Added groundwork for col2im algorithm plus first non-working version of kerne... | Cedric Nugteren
2018-10-22 | Some name changes in im2col code | Cedric Nugteren
2018-10-17 | Fixed MSVC's compilation error C1061 due to too many for-loops | Cedric Nugteren
2018-10-17 | Fixed a bug with the pre-processing and the AXPY kernel | Cedric Nugteren
2018-10-16 | Merge pull request #325 from CNugteren/CLBlast-321-axpy-faster-kernel-bug | Cedric Nugteren
2018-10-15 | Fixed a bug in the XaxpyFaster kernel for specific parameters | Cedric Nugteren
2018-10-14 | Merge pull request #319 from CNugteren/convgemm_multi_kernel | Cedric Nugteren
2018-10-14 | Merge pull request #324 from CNugteren/CLBlast-315-tuning-api-improvements | Cedric Nugteren
2018-10-13 | Updated changelog regarding tuning API change | Cedric Nugteren
2018-10-13 | Made tuning API more flexible: disregards any extra parameter values | Cedric Nugteren
2018-10-13 | Updated the documentation for GEMV tuning | Cedric Nugteren
2018-10-11 | Merge pull request #323 from CNugteren/CLBlast-322-fix-preprocessor-warnings | Cedric Nugteren
2018-10-10 | Fixed pre-processor warnings related to the subgroup shuffling | Cedric Nugteren
2018-09-16 | Merge branch 'master' into convgemm_multi_kernel | Cedric Nugteren
2018-09-15 | Merge pull request #318 from CNugteren/CLBlast-315-preprocessor-gemmk1-issue | Cedric Nugteren
2018-09-15 | Fixed an MSVC compilation error due to large strings | Cedric Nugteren
2018-09-15 | Added a kernel-parameter pair table to document the tuning API | Cedric Nugteren
2018-09-15 | Fixed an MSVC compilation error due to large strings | Cedric Nugteren
2018-09-15 | Disabled Intel subgroup shuffling for double-precision | Cedric Nugteren
2018-09-15 | Fixed issues with GEMMK=1 kernel and the pre-processor | Cedric Nugteren
2018-09-15 | Added pre-processor test for GEMMK=1 kernel | Cedric Nugteren
2018-09-07 | Reduced size of the xCONVGEMM correctness tests | Cedric Nugteren
2018-09-07 | Added reference implementation for xCONVGEMM for half-precision | Cedric Nugteren
2018-09-07 | Added xCONVGEMM as im2col plus a batched GEMM kernel | Cedric Nugteren
2018-09-03 | Merge pull request #316 from ranocha/patch-1 | Cedric Nugteren
2018-09-03 | Add Julia Wrapper | Hendrik Ranocha
2018-08-14 | Merge pull request #312 from CNugteren/CLBlast-311-missing-event-in-trsv-trsm | Cedric Nugteren
2018-08-13 | Made last operation in TRSV and TRSM asynchronous, making the events not null | Cedric Nugteren
2018-08-13 | Small refactoring of events in TRSV substitution routine | Cedric Nugteren
2018-08-09 | Merge pull request #310 from CNugteren/CLBlast-307-netlib-api-static-opencl-vars | Cedric Nugteren
2018-08-07 | Name change of setting to NETLIB_PERSISTENT_OPENCL | Cedric Nugteren
2018-08-05 | Added an option to compile the Netlib API with static OpenCL device and context | Cedric Nugteren
2018-08-02 | Merge pull request #309 from CNugteren/CLBlast-306-omatcopy-conjugate | Cedric Nugteren
2018-07-31 | Merge pull request #308 from CNugteren/CLBlast-301-weird-AMD-Hainan-bug | Cedric Nugteren
2018-07-31 | Fixed issue with not performing complex conjugation under certain cases when ... | Cedric Nugteren
2018-07-31 | Fixed the tests of OMATCOPY to include proper complex conjugation | Cedric Nugteren
2018-07-31 | Fixed an error reporting issue related to the canary region | Cedric Nugteren
2018-07-31 | Added note about AMD southern islands GPU issue and the required workaround | Cedric Nugteren
2018-07-31 | Added Beignet 1.2.1 requirement to the README for IvyBridge GPUs | Cedric Nugteren
2018-07-31 | Updated the tuning results for Intel IvyBridge M GT2 | Cedric Nugteren
2018-07-30 | Merge pull request #305 from CNugteren/CLBlast-303-tuner-check-local-size | Cedric Nugteren