Age  Commit message  Author
2018-05-19  Merge branch 'master' into CLBlast-267-convgemm  Cedric Nugteren
2018-05-19  Merge pull request #284 from CNugteren/routine_tuners_read_kernel_json_from_disk  Cedric Nugteren
            Routine tuners read kernel JSON from disk
2018-05-19  Fixed compilation issues  Cedric Nugteren
2018-05-19  The GEMM routine tuner now loads kernel JSON tuning results from disk if available; now run part of alltuners target  Cedric Nugteren
2018-05-19  Fixed a bug in loading xgemm-direct JSON data from disk  Cedric Nugteren
2018-05-18  Merge pull request #283 from CNugteren/canary_buffer_overflow_protection  Cedric Nugteren
            Canary buffer overflow protection
2018-05-18  Merge branch 'master' into canary_buffer_overflow_protection  Cedric Nugteren
2018-05-17  Merge pull request #282 from CNugteren/CLBlast-276-program-release-improvements  Cedric Nugteren
            Better cache behaviour of OpenCL programs
2018-05-17  Updated the roadmap  Cedric Nugteren
2018-05-17  Updated README with IWOCL talk and GPU zoo acknowledgment  Cedric Nugteren
2018-05-17  Added documentation on some details of the GEMM implementation  Cedric Nugteren
2018-05-17  Fixed a few issues with canary region testing  Cedric Nugteren
2018-05-17  Added a canary region for overflow detection to the correctness tests  Cedric Nugteren
2018-05-17  Added a canary region for overflow detection to the tuners  Cedric Nugteren
2018-05-17  First version of direct reading from image tensor for convgemm: only for edge cases now  Cedric Nugteren
2018-05-13  Created a dedicated convgemm GEMM kernel as a copy of the batched direct gemm kernel  Cedric Nugteren
2018-05-13  Plugged in the code of strided-batched-gemm into convgemm in preparation of a new kernel  Cedric Nugteren
2018-05-09  Changed temporary convgemm implementation to use batched-strided GEMM  Cedric Nugteren
2018-05-09  Fixed the performance client for convgemm and added GFLOPS measurements  Cedric Nugteren
2018-05-09  Merge pull request #279 from umar456/ci_links  Cedric Nugteren
            Update ci links to use doman names and build names instead of IP/id
2018-05-09  Updated the documentation for convgemm to include data layout (NCHW)  Cedric Nugteren
2018-05-09  Implemented convolution as im2col + GEMM  Cedric Nugteren
2018-05-09  Split channels/strides testing values off from kernel sizes for more flexibility  Cedric Nugteren
2018-05-08  Update ci links to use doman names and build names instead of IP/id  Umar Arshad
            Updates the README badges to point to the domain name instead of IP addresses. Also updates the names of the builds to the name of the build instead of the id of the build.
2018-05-06  Added convgemm skeleton, test infrastructure, and first reference implementation  Cedric Nugteren
2018-05-05  Added interface of batched convolution as GEMM  Cedric Nugteren
2018-05-01  Updated README with new badges and paper citation  Cedric Nugteren
2018-05-01  Now stores a shared_ptr to the Program class in the cache  Cedric Nugteren
2018-04-29  Merge pull request #277 from CNugteren/CLBlast-257-intel-subgroups  Cedric Nugteren
            Intel subgroup shuffling
2018-04-29  Updated the changelog  Cedric Nugteren
2018-04-29  Updated the roadmap  Cedric Nugteren
2018-04-26  Fixed an access violation when compiled with Visual Studio upon releasing the OpenCL program  Cedric Nugteren
2018-04-24  Added Intel subgroup shuffle support to the 2D register caching GEMM kernel  Cedric Nugteren
2018-04-24  Added a define to enable subgroup shuffling if supported by the device  Cedric Nugteren
2018-04-21  Merge pull request #274 from CNugteren/CLBlast-228-2d-register-gemm-kernel  Cedric Nugteren
            Added 2D-register-caching GEMM kernel
2018-04-20  Fixes for the CUDA API  Cedric Nugteren
2018-04-18  Expressed HER2K as two HERK calls  Cedric Nugteren
2018-04-18  Expressed SYR2K as two SYRK calls  Cedric Nugteren
2018-04-17  Updated HERK and SYRK to follow the GEMM style and functions to make it work with the new kernel  Cedric Nugteren
2018-04-15  Fixed some failing tests for GEMM and batched GEMM routines  Cedric Nugteren
2018-04-15  Updated tuning results for the Skylake ULT GT2 GPU with the new kernel  Cedric Nugteren
2018-04-13  Made GEMM rotation expectations kernel-specific  Cedric Nugteren
2018-04-10  Updated database with defaults of GEMMK=0 and KREG=1  Cedric Nugteren
2018-04-10  Made it possible to add tuning parameters to the database using the script  Cedric Nugteren
2018-04-10  Fixed a bug in the compression part of the database script  Cedric Nugteren
2018-04-08  Extended the maximum number of tuning parameters from 14 to 16  Cedric Nugteren
2018-04-08  Fixed issues with the pre-processor  Cedric Nugteren
2018-04-07  Merge branch 'master' into CLBlast-228-2d-register-gemm-kernel  Cedric Nugteren
2018-04-07  Added tuning results for NVIDIA GeForce 970  Cedric Nugteren
2018-04-07  Added tuning results for NVIDIA GeForce 920MX  Cedric Nugteren