path: root/src
Age | Commit message | Author
2019-02-09 | Added tuning parameters for Tesla P100 16GB | Cedric Nugteren
2019-02-09 | Added tuning parameters for Xeon E5-2630 v3 and v4 | Cedric Nugteren
2019-01-23 | Added fp32 to fp16 conversion function in Python to make haxpy example work | Cedric Nugteren
2019-01-22 | Added a (non-working) sample of half precision AXPY in Python | Cedric Nugteren
2019-01-22 | Updated pyclblast README, updated to 1.2.0 for half-precision support | Cedric Nugteren
2019-01-22 | Added experimental support for half-precision in pyclblast | Cedric Nugteren
2019-01-19 | Merge pull request #345 from CNugteren/convolution-fixes-and-tuner | Cedric Nugteren
2019-01-19 | Added a few more initial Intel tuning parameters for convgemm | Cedric Nugteren
2019-01-05 | Added a check to prevent the stride of matrix C being set to 0 for the stride... | Cedric Nugteren
2018-12-31 | Added convgemm to the CLBlast database, added initial parameters for Skylake GPU | Cedric Nugteren
2018-12-31 | Added support for the convgemm tuner in the tuner database | Cedric Nugteren
2018-12-31 | Added the forgotten batch dimension to the tuner to get correct kernel execut... | Cedric Nugteren
2018-12-18 | Fix the xconvgemm tuner | Koichi Akabe
2018-12-18 | Added first version of a tuner for the ConvGemm direct kernel | Cedric Nugteren
2018-12-18 | Fix xconvgemm kernel and enable ConvGemmMethod::kSingleKernel | Koichi Akabe
2018-11-30 | Fixed an issue for unequal MWG and NWG and the new GEMMK == 1 kernel | Cedric Nugteren
2018-11-19 | Remove unnecessary attribute of inline function | Koichi Akabe
2018-11-12 | Add kernel_mode option to im2col, col2im, and convgemm functions | Koichi Akabe
2018-11-07 | Changed col2im to append to the existing im-buffer | Cedric Nugteren
2018-11-01 | Added new col2im routine to the documentation | Cedric Nugteren
2018-10-30 | Fix col2im implementation | Koichi Akabe
2018-10-23 | Added groundwork for col2im algorithm plus first non-working version of kerne... | Cedric Nugteren
2018-10-22 | Some name changes in im2col code | Cedric Nugteren
2018-10-17 | Fixed a bug with the pre-processing and the AXPY kernel | Cedric Nugteren
2018-10-15 | Fixed a bug in the XaxpyFaster kernel for specific parameters | Cedric Nugteren
2018-10-14 | Merge pull request #319 from CNugteren/convgemm_multi_kernel | Cedric Nugteren
2018-10-13 | Made tuning API more flexible: disregards any extra parameter values | Cedric Nugteren
2018-10-10 | Fixed pre-processor warnings related to the subgroup shuffling | Cedric Nugteren
2018-09-16 | Merge branch 'master' into convgemm_multi_kernel | Cedric Nugteren
2018-09-15 | Fixed an MSVC compilation error due to large strings | Cedric Nugteren
2018-09-15 | Fixed an MSVC compilation error due to large strings | Cedric Nugteren
2018-09-15 | Disabled Intel subgroup shuffling for double-precision | Cedric Nugteren
2018-09-15 | Fixed issues with GEMMK=1 kernel and the pre-processor | Cedric Nugteren
2018-09-07 | Added xCONVGEMM as im2col plus a batched GEMM kernel | Cedric Nugteren
2018-08-13 | Made last operation in TRSV and TRSM asynchronous, making the events not null | Cedric Nugteren
2018-08-13 | Small refactoring of events in TRSV substitution routine | Cedric Nugteren
2018-08-07 | Name change of setting to NETLIB_PERSISTENT_OPENCL | Cedric Nugteren
2018-08-05 | Added an option to compile the Netlib API with static OpenCL device and context | Cedric Nugteren
2018-08-02 | Merge pull request #309 from CNugteren/CLBlast-306-omatcopy-conjugate | Cedric Nugteren
2018-07-31 | Merge pull request #308 from CNugteren/CLBlast-301-weird-AMD-Hainan-bug | Cedric Nugteren
2018-07-31 | Fixed issue with not performing complex conjugation under certain cases when ... | Cedric Nugteren
2018-07-31 | Updated the tuning results for Intel IvyBridge M GT2 | Cedric Nugteren
2018-07-29 | Fixed a wrong event issue causing error -57 | Cedric Nugteren
2018-07-29 | Removed complex numbers support for CONVGEMM | Cedric Nugteren
2018-07-29 | Merge branch 'master' into CLBlast-267-convgemm | Cedric Nugteren
2018-07-28 | Added print statements to indicate the 4 stages of GEMM tuning | Cedric Nugteren
2018-07-28 | The tuners now also check for valid local thread configurations and skip inva... | Cedric Nugteren
2018-07-28 | Disabled the use of staggered indices on AMD GPUs for the new GEMMK == 1 kern... | Cedric Nugteren
2018-07-27 | Fixed an issue with AMD GPUs and the new GEMMK == 1 kernel | Cedric Nugteren
2018-07-27 | Fixed a bug: forgot to initialize the shared pointer for the null kernel | Cedric Nugteren