Age | Commit message | Author |
---|---|---|
2017-04-17 | Fixed a namespace clash with CUDA FP16 for the half-datatype | Cedric Nugteren |
2017-03-14 | Added the possibility to tune batched kernels | Cedric Nugteren |
2016-11-27 | Made it possible to use the command-line environmental variables for each exe... | Cedric Nugteren |
2016-10-22 | Moved files around a bit; created a utilities subfolder | Cedric Nugteren |
2016-10-03 | Re-organised GEMM direct kernel and added faster fall-back version for incomp... | Cedric Nugteren |
2016-10-02 | Specialised the GEMM direct kernel in four ways for transposing/non-transposi... | Cedric Nugteren |
2016-10-02 | Split the GEMM direct kernel into two files; set the default tuning target to... | Cedric Nugteren |
2016-10-01 | Added padding to the local memory of the GEMM direct kernel | Cedric Nugteren |
2016-10-01 | Added default num-runs to the tuner adding averaging over 10 runs as a defaul... | Cedric Nugteren |
2016-09-25 | Added a first version of a tuner for the GEMM direct kernel; collapsed MWGD, ... | Cedric Nugteren |