| Age | Commit message | Author |
|---|---|---|
| 2019-02-09 | Added tuning parameters for Xeon E5-2630 v3 and v4 | Cedric Nugteren |
| 2019-01-26 | Merge pull request #348 from CNugteren/CLBlast-334-pyclblast-half-precision-s... | Cedric Nugteren |
| 2019-01-23 | Added fp32 to fp16 conversion function in Python to make haxpy example work | Cedric Nugteren |
| 2019-01-22 | Added a (non-working) sample of half precision AXPY in Python | Cedric Nugteren |
| 2019-01-22 | Updated pyclblast README, updated to 1.2.0 for half-precision support | Cedric Nugteren |
| 2019-01-22 | Added experimental support for half-precision in pyclblast | Cedric Nugteren |
| 2019-01-19 | Merge pull request #345 from CNugteren/convolution-fixes-and-tuner | Cedric Nugteren |
| 2019-01-19 | Added documentation on the convgemm routine | Cedric Nugteren |
| 2019-01-19 | Added a few more initial Intel tuning parameters for convgemm | Cedric Nugteren |
| 2019-01-05 | Added a check to prevent the stride of matrix C being set to 0 for the stride... | Cedric Nugteren |
| 2018-12-31 | Added convgemm to the CLBlast database, added initial parameters for Skylake GPU | Cedric Nugteren |
| 2018-12-31 | Added support for the convgemm tuner in the tuner database | Cedric Nugteren |
| 2018-12-31 | Added the forgotten batch dimension to the tuner to get correct kernel execut... | Cedric Nugteren |
| 2018-12-23 | Merge pull request #343 from vbkaisetsu/feature/convgemm-single | Cedric Nugteren |
| 2018-12-22 | Merge branch 'master' into convolution-fixes-and-tuner | Cedric Nugteren |
| 2018-12-21 | Update changelog | Koichi Akabe |
| 2018-12-18 | Update the documentation | Koichi Akabe |
| 2018-12-18 | Fix the xconvgemm tuner | Koichi Akabe |
| 2018-12-18 | Added first version of a tuner for the ConvGemm direct kernel | Cedric Nugteren |
| 2018-12-18 | Fix xconvgemm kernel and enable ConvGemmMethod::kSingleKernel | Koichi Akabe |
| 2018-12-17 | Merge pull request #342 from vbkaisetsu/fix/im2col-hf-tests | Cedric Nugteren |
| 2018-12-17 | Fix half-float+kernel_mode test cases of im2col, col2im, and convgemm | Koichi Akabe |
| 2018-12-04 | Updated to version 1.5.0 | Cedric Nugteren |
| 2018-12-01 | Updated the roadmap document | Cedric Nugteren |
| 2018-12-01 | Added a FAQ document | Cedric Nugteren |
| 2018-12-01 | Merge pull request #341 from CNugteren/CLBlast-340-GEMMK1-issue-with-unequal-... | Cedric Nugteren |
| 2018-11-30 | Fixed an issue for unequal MWG and NWG and the new GEMMK == 1 kernel | Cedric Nugteren |
| 2018-11-19 | Merge pull request #335 from vbkaisetsu/patch-1 | Cedric Nugteren |
| 2018-11-19 | Remove unnecessary attribute of inline function | Koichi Akabe |
| 2018-11-17 | Merge pull request #332 from vbkaisetsu/feature/im2col-col2im-flip | Cedric Nugteren |
| 2018-11-12 | Add kernel_mode option to im2col, col2im, and convgemm functions | Koichi Akabe |
| 2018-11-09 | Merge pull request #331 from CNugteren/CLBlast-270-col2im | Cedric Nugteren |
| 2018-11-07 | Changed col2im to append to the existing im-buffer | Cedric Nugteren |
| 2018-11-01 | Added new col2im routine to the documentation | Cedric Nugteren |
| 2018-11-01 | Fixed half-precision tests for im2col and col2im | Cedric Nugteren |
| 2018-10-31 | Merge pull request #330 from vbkaisetsu/CLBlast-270-col2im | Cedric Nugteren |
| 2018-10-30 | Fix col2im implementation | Koichi Akabe |
| 2018-10-29 | Merge pull request #329 from tholu/patch-1 | Cedric Nugteren |
| 2018-10-28 | Update FindOpenCL.cmake | Thomas Lutz |
| 2018-10-23 | Added groundwork for col2im algorithm plus first non-working version of kerne... | Cedric Nugteren |
| 2018-10-22 | Some name changes in im2col code | Cedric Nugteren |
| 2018-10-17 | Fixed MSVC's compilation error C1061 due to too many for-loops | Cedric Nugteren |
| 2018-10-17 | Fixed a bug with the pre-processing and the AXPY kernel | Cedric Nugteren |
| 2018-10-16 | Merge pull request #325 from CNugteren/CLBlast-321-axpy-faster-kernel-bug | Cedric Nugteren |
| 2018-10-15 | Fixed a bug in the XaxpyFaster kernel for specific parameters | Cedric Nugteren |
| 2018-10-14 | Merge pull request #319 from CNugteren/convgemm_multi_kernel | Cedric Nugteren |
| 2018-10-14 | Merge pull request #324 from CNugteren/CLBlast-315-tuning-api-improvements | Cedric Nugteren |
| 2018-10-13 | Updated changelog regarding tuning API change | Cedric Nugteren |
| 2018-10-13 | Made tuning API more flexible: disregards any extra parameter values | Cedric Nugteren |
| 2018-10-13 | Updated the documentation for GEMV tuning | Cedric Nugteren |