path: root/src/kernels/level3
Age  Commit message  Author
2018-01-08  Implemented the in-direct version of the strided-batched GEMM kernel  Cedric Nugteren
2018-01-07  Implemented direct version of strided-batched GEMM kernel  Cedric Nugteren
2017-12-31  Revert "Added options to disable parts of the invert kernel to find out where...  Cedric Nugteren
2017-12-31  Changed the invert kernel slightly; added part1a/part1b disable-defines  Cedric Nugteren
2017-12-30  Fixed ifdef's into ifndef's  Cedric Nugteren
2017-12-30  Added options to disable parts of the invert kernel to find out where the AMD...  Cedric Nugteren
2017-12-27  Simplified invert kernel a little  Cedric Nugteren
2017-12-23  Split the invert kernel in two parts to prevent error C1091 in MSVC 2013  Cedric Nugteren
2017-12-19  Added skeleton for a tuner for the invert kernel  Cedric Nugteren
2017-12-10  Split GEMM kernel in 4 files instead of 3 due to MSVC 2013 string length limit  Cedric Nugteren
2017-12-09  Completed kernel modifications for pre-processor of all other kernels  Cedric Nugteren
2017-12-09  Modified the direct GEMM kernel to support array-to-register promotion  Cedric Nugteren
2017-12-09  Reformatted GEMM kernel to support array-to-register promotion  Cedric Nugteren
2017-12-09  Fixed defines parsing and substituting in pre-processor; fixed some variable ...  Cedric Nugteren
2017-12-07  Added register promotion to the main GEMM kernel  Cedric Nugteren
2017-12-03  Added GEMM (direct and in-direct) to the pre-processor testing; modified the ...  Cedric Nugteren
2017-12-03  Reformated transpose kernels for the pre-processor; extended the amount of tests  Cedric Nugteren
2017-11-29  Reformatted unrollable kernel loops and added the new promote_to_registers pr...  Cedric Nugteren
2017-10-14  Fixed a kernel/attribute order bug in the direct GEMM kernels  Cedric Nugteren
2017-10-14  Make local memory pointers a define in OpenCL; some fixes to the recently cha...  Cedric Nugteren
2017-10-14  Made transpose kernel struct init proper according to the C standard  Cedric Nugteren
2017-10-03  Gemm in-direct implementation now uses only 1 larger instead of max 3 optiona...  Cedric Nugteren
2017-07-08  Made the inline keyword in kernels optional currently only enabled for NVIDIA...  Cedric Nugteren
2017-06-30  Fixed an if-statement in the direct GEMM kernel causing a bug with specific s...  Cedric Nugteren
2017-05-14  Fixed a missing synchronization barrier in the invert kernel; fixes TRSM tests  Cedric Nugteren
2017-03-19  Added an (optional) non-direct implementation of the batched GEMM routine  Cedric Nugteren
2017-03-19  Added batched versions of the pad/copy/transpose kernels  Cedric Nugteren
2017-03-11  Added initial naive version of the batched GEMM routine based on the direct G...  Cedric Nugteren
2017-03-04  Added a proper data-preparation function for the TRSM tests  Cedric Nugteren
2017-02-26  Fixed an out-of-bounds memory access when filling a matrix with a constant  Cedric Nugteren
2017-02-26  Fixes division in the kernel for inversion of complex numbers  Cedric Nugteren
2017-02-25  Added PrepareData function for TRSM to create proper test input  Cedric Nugteren
2017-01-18  Added first version of the TRSM routine based on the diagonal invert kernel  Cedric Nugteren
2017-01-15  Added a first version of the diagonal block invert routine in preparation of ...  Cedric Nugteren
2016-12-18  Fixed a bug when using offsets in the direct GEMM kernels  Cedric Nugteren
2016-10-22  Fixed a bug in the SYRK/SYR2K/HERK/HER2K routines that would occur with speci...  Cedric Nugteren
2016-10-22  Fixed a bug in the SYRK/SYR2K/HERK/HER2K routines that would occur with speci...  Cedric Nugteren
2016-10-03  Fixed a const-correctness issue with complex conjugation in the GEMM direct k...  Cedric Nugteren
2016-10-03  Added functions to load from off-chip to local memory without vector loads fo...  Cedric Nugteren
2016-10-03  Re-organised GEMM direct kernel and added faster fall-back version for incomp...  Cedric Nugteren
2016-10-02  Specialised the GEMM direct kernel in four ways for transposing/non-transposi...  Cedric Nugteren
2016-10-02  Split the GEMM direct kernel into two files; set the default tuning target to...  Cedric Nugteren
2016-10-01  Added padding to the local memory of the GEMM direct kernel  Cedric Nugteren
2016-09-25  Added a first version of a tuner for the GEMM direct kernel; collapsed MWGD, ...  Cedric Nugteren
2016-09-25  Separated the tuning parameters of the new direct GEMM kernel from the indire...  Cedric Nugteren
2016-09-25  Added a first version of the direct version of GEMM with local memory  Cedric Nugteren
2016-09-21  Merge branch 'development' into gemm_direct  Cedric Nugteren
2016-09-12  Split the XGEMM kernel further up: now in 3 parts. This is done because MSVC ...  Cedric Nugteren
2016-09-04  The GEMM kernel no longer adds beta*C in case beta is zero; this would cause ...  Cedric Nugteren
2016-08-20  Merge branch 'master' of https://github.com/dvasschemacq/CLBlast into dvassch...  Cedric Nugteren