Age | Commit message | Author |
---|---|---|
2018-07-14 | Applied feedback from Cedric from first pull request | Tyler Sorensen |
2018-07-11 | added inline ptx to support shuffle on Nvidia GPUs | Tyler Sorensen |
2018-05-01 | Now stores a shared_ptr to the Program class in the cache | Cedric Nugteren |
2018-04-24 | Added a define to enable subgroup shuffling if supported by the device | Cedric Nugteren |
2017-12-24 | Fixes for the CUDA backend of CLBlast | Cedric Nugteren |
2017-12-17 | Removed all ARM Mali tuning results; re-added Mali-T760 and Mali-T628 results... | Cedric Nugteren |
2017-12-09 | Made the pre-processor run by default for ARM and Qualcomm GPUs | Cedric Nugteren |
2017-11-30 | Integrated pre-processor in compilation flow, default is still disabled | Cedric Nugteren |
2017-11-19 | Added compilation timing and better compilation error reporting | Cedric Nugteren |
2017-11-17 | Moved compilation function to separate file; removed dependency of tuners of ... | Cedric Nugteren |