Age | Commit message | Author |
2017-02-13 | Fixed a small bug in GEMV: unused kernel in parameter list | Cedric Nugteren |
2017-02-12 | Split the database into several smaller cached per-kernel databases (in prepa... | Cedric Nugteren |
2017-01-24 | Routine, Cache: generalize, reduce amount of copying in fast path | Ivan Shapovalov |
2017-01-20 | treewide: include clpp11.hpp first to silence deprecation warnings | Ivan Shapovalov |
2016-12-18 | Fixed a bug when using offsets in the direct GEMM kernels | Cedric Nugteren |
2016-11-24 | Merge pull request #125 from CNugteren/netlib_blas_api | Cedric Nugteren |
2016-11-23 | Fixed a bug in the HSCAL routine | Cedric Nugteren |
2016-11-20 | Now correctly tests for validity of the B matrix in the TRMM routine | Cedric Nugteren |
2016-11-20 | Fixed a bug in the TRMM routine caused by overwriting input data before consu... | Cedric Nugteren |
2016-10-23 | Fixed a bug in the transpose-matrix function | Cedric Nugteren |
2016-10-22 | Merge pull request #117 from intelfx/exceptions | Cedric Nugteren |
2016-10-22 | Fixed a bug in the SYRK/SYR2K/HERK/HER2K routines that would occur with speci... | Cedric Nugteren |
2016-10-22 | Routine: get rid of ::SetUp() | Ivan Shapovalov |
2016-10-22 | treewide: use C++ exceptions properly | Ivan Shapovalov |
2016-10-10 | Fixed an issue with the length of the GEMM OpenCL string for both MSVC 2013 a... | Cedric Nugteren |
2016-10-06 | Added a kernel selection database to select between the direct and indirect G... | Cedric Nugteren |
2016-10-03 | Re-organised GEMM direct kernel and added faster fall-back version for incomp... | Cedric Nugteren |
2016-10-02 | Specialised the GEMM direct kernel in four ways for transposing/non-transposi... | Cedric Nugteren |
2016-10-02 | Split the GEMM direct kernel into two files; set the default tuning target to... | Cedric Nugteren |
2016-09-25 | Added a first version of a tuner for the GEMM direct kernel; collapsed MWGD, ... | Cedric Nugteren |
2016-09-25 | Separated the tuning parameters of the new direct GEMM kernel from the indire... | Cedric Nugteren |
2016-09-21 | Merge branch 'development' into gemm_direct | Cedric Nugteren |
2016-09-12 | Split the XGEMM kernel further up: now in 3 parts. This is done because MSVC ... | Cedric Nugteren |
2016-07-26 | Fixed issues related to the recent changes in the Xgemm infrastructure | Cedric Nugteren |
2016-07-26 | Merge branch 'development' into gemm_direct | Cedric Nugteren |
2016-07-25 | Moved the XgemvFast and XgemvFastRot tuning database into a separate file | Cedric Nugteren |
2016-07-24 | Merge branch 'development' into gemv_performance | Cedric Nugteren |
2016-07-23 | Improved the XgemvFastRot kernel by tiled loading of the input matrix A, enab... | Cedric Nugteren |
2016-07-22 | clblast::RunKernel, cl::Kernel: unify variants with/without waitForEvents, su... | Ivan Shapovalov |
2016-07-22 | clblast::RunKernel, cl::Kernel: take const vector as waitForEvents | Ivan Shapovalov |
2016-07-22 | xgemm: do not hardcode kernel requirements for internal matrix layout | Ivan Shapovalov |
2016-07-17 | Improved the GEMM direct kernel by adding register blocking. Still not fast t... | Cedric Nugteren |
2016-07-16 | Created infrastructure to support a direct GEMM kernel; added correct but slo... | Cedric Nugteren |
2016-07-16 | Removed an unused variable from the copy-transpose-pad function | Cedric Nugteren |
2016-07-10 | Now passing alpha/beta to the kernel as arguments as before fp16 support; in ... | Cedric Nugteren |
2016-07-06 | Added a VERBOSE mode to debug performance: now prints details about compilati... | Cedric Nugteren |
2016-06-27 | Fixes for the AppVeyor Windows build | Cedric Nugteren |
2016-06-19 | Renamed all C++ source files to .cpp to match the .hpp extension better | Cedric Nugteren |
2016-06-18 | Moved all headers into the source tree, changed headers to .hpp extension | Cedric Nugteren |
2016-06-18 | Clean-up of the routine class, moved RunKernel to the routine/common file | Cedric Nugteren |
2016-06-18 | Removed the template from the Routine base-class | Cedric Nugteren |
2016-06-17 | Removed the precision argument from the routines in favor of a single templat... | Cedric Nugteren |
2016-06-17 | Removed the interface to the cache functions from the Routine class, calls th... | Cedric Nugteren |
2016-06-17 | Moved the RunKernel and PadCopyTransposeMatrix functions out of the Routine c... | Cedric Nugteren |
2016-06-17 | Moved the test-for-valid-buffers function from the Routine class to separate ... | Cedric Nugteren |
2016-06-16 | Added XOMATCOPY routines to perform out-of-place matrix scaling, copying, and... | Cedric Nugteren |
2016-06-15 | Added some constness to variables related to the GEMM routines | Cedric Nugteren |
2016-06-14 | Re-organised the level-3 supporting kernels (copy, pad, transpose, convert) a... | Cedric Nugteren |
2016-05-25 | Added level-3 half-precision routines HGEMM/HSYMM/HSYRK/HSYR2K/HTRMM | Cedric Nugteren |
2016-05-22 | Added level-2 half-precision routines HGER/HSYR/HSPR/HSYR2/HSPR2 | Cedric Nugteren |