path: root/src
Age | Commit message | Author
2023-01-21 | Add tuning results for Adreno 730 | Cedric Nugteren
2023-01-17 | Updated according to feedback from CNugteren | Angus, Alexander
2023-01-03 | implemented changes to boost Adreno performance according to https://jira-dc.... | Angus, Alexander
2022-09-22 | Update PyCLBlast version number | Cedric Nugteren
2022-06-24 | Fix typo in comment | Cedric Nugteren
2022-05-23 | Fix API inconsistency in cupp11.hpp | Cedric Nugteren
2022-05-16 | Merge pull request #432 from justingra/sum-fix | Cedric Nugteren
2022-04-25 | Add tuning results for Adreno 540 | Cedric Nugteren
2022-04-25 | Add tuning results for Radeon RX 6500 XT | Cedric Nugteren
2022-04-25 | Add tuning results for Radeon RX 6800 XT | Cedric Nugteren
2022-04-22 | sum fix | Justin Graham
2022-04-13 | android.hpp: custom header guard of _clang_ | danyougle
2021-08-27 | Add Quadro T2000 tuning parameters for the Tesla T4 | Cedric Nugteren
2021-08-27 | Remove Tesla T4 tuning results | Cedric Nugteren
2021-08-19 | Add tuning results for NVIDIA Tesla V100 | Cedric Nugteren
2021-08-19 | Add tuning results for NVIDIA Tesla T4 | Cedric Nugteren
2021-08-19 | Add tuning results for NVIDIA Quadro T2000 | Cedric Nugteren
2021-08-19 | Add tuning results for NVIDIA Quadro GV100 | Cedric Nugteren
2021-08-19 | Add tuning results for Intel Core i9-9980HK | Cedric Nugteren
2021-08-19 | Add tuning results for NVIDIA A100 | Cedric Nugteren
2021-05-22 | Fix issue with printing out-of-bounds local/global sizes for level 1 tuners | Cedric Nugteren
2021-03-13 | set the correct flop count for xgemm | JishinMaster
2021-02-05 | Fix Windows paths in pyclblast | Cedric Nugteren
2021-02-04 | Added second Windows library path | Cedric Nugteren
2021-01-30 | Add library path for Windows as well | Cedric Nugteren
2021-01-29 | Add library dir on Linux for pyclblast | Cedric Nugteren
2021-01-21 | Update pyclblast package version number | Cedric Nugteren
2021-01-20 | Use reference types to prevent unnecessary copying | Jerry James
2020-10-10 | Add tuning results for TITAN RTX | Cedric Nugteren
2020-10-10 | Add tuning results for Radeon RX Vega | Cedric Nugteren
2020-06-07 | Add a cautionary note in Program::GetIR and mention the fix in CHANGELOG | Pradeep Garigipati
2020-06-05 | Fix Program::GetIR to handle programs with multiple devices | Pradeep Garigipati
2020-05-11 | Increase display width of the local/global sizes | Cedric Nugteren
2020-05-10 | Made sure that the global workgroup size is a multiple of the local size in t... | Cedric Nugteren
2020-05-10 | Added logging of local/global workgroup sizes when run the tuners | Cedric Nugteren
2020-05-10 | Updated PyCLBlast version number | Cedric Nugteren
2020-05-10 | Added a sample to demonstrate a batched routine | Cedric Nugteren
2020-05-10 | Added pyclblast bindings for the 3 batched routines | Cedric Nugteren
2020-05-03 | Move queue creation out of the tuner loop | Cedric Nugteren
2020-03-08 | Made it more likely (but no guarantees) for amax/amin to return the first index | Cedric Nugteren
2020-03-08 | Silenced a new OpenCL warning message | Cedric Nugteren
2020-02-17 | Catches all exceptions of the tuners | Cedric Nugteren
2019-12-09 | Reduce TestMatrix calls for xgemmstridedbatched. | Tarmo Räntilä
2019-12-09 | Reduce TestMatrix calls for xgemmbatched. | Tarmo Räntilä
2019-09-04 | Fix out-of-bounds read/write in XhadFaster | etomzak
2019-05-19 | Fixed a bug in the absolute-min index kernel | Cedric Nugteren
2019-05-11 | Added a function to set the OpenCL kernel standard, either 1.1 or 1.2 | Cedric Nugteren
2019-05-08 | Changed back to cl_intel_subgroups as suggested | Cedric Nugteren
2019-05-07 | Added a host-code check to make sure the avc_motion_estimation is available | Cedric Nugteren
2019-05-07 | Enabled avc_motion_estimation extension for Intel subgroup shuffling | Cedric Nugteren