Age | Commit message | Author
2021-08-24 | PPA for older Ubuntus | Gard Spreemann
2021-08-24 | Let the installation documentation reflect the fact that CLBlast is now in Debian and Ubuntu | Gard Spreemann
2021-08-20 | Merge pull request #423 from CNugteren/new_tuning_results | Cedric Nugteren
New tuning results for 1 Intel CPU and 5 NVIDIA GPUs
2021-08-19 | Added a note on clock frequencies for tuning | Cedric Nugteren
2021-08-19 | Updated README and tuning list | Cedric Nugteren
2021-08-19 | Add tuning results for NVIDIA Tesla V100 | Cedric Nugteren
2021-08-19 | Add tuning results for NVIDIA Tesla T4 | Cedric Nugteren
2021-08-19 | Add tuning results for NVIDIA Quadro T2000 | Cedric Nugteren
2021-08-19 | Add tuning results for NVIDIA Quadro GV100 | Cedric Nugteren
2021-08-19 | Add tuning results for Intel Core i9-9980HK | Cedric Nugteren
2021-08-19 | Add tuning results for NVIDIA A100 | Cedric Nugteren
2021-05-23 | Merge pull request #419 from CNugteren/fix_tuner_out_of_bounds_access | Cedric Nugteren
Fix tuner printing issue
2021-05-22 | Fix issue with printing out-of-bounds local/global sizes for level 1 tuners | Cedric Nugteren
2021-04-30 | Merge pull request #417 from gspr/gspr/capitalization-typo | Cedric Nugteren
Correct capitalization typo
2021-04-30 | Correct capitalization typo | Gard Spreemann
The CLBlastConfig.cmake file was installed to a directory named CLBLast (notice second capital l), which can cause issues for CMake's search path when looking for CLBlast on the system. This commit also fixes other occurrences of the wrong capitalization, all of it purely cosmetic (i.e. in comments).
2021-03-15 | Merge pull request #416 from JishinMaster/master | Cedric Nugteren
set the correct flop count for xgemm
2021-03-13 | set the correct flop count for xgemm | JishinMaster
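For background on the entry above: the conventional operation count for a general matrix multiplication with an m x k matrix A and a k x n matrix B is 2*m*n*k, one multiply and one add per inner-product term. The sketch below shows how a benchmark could turn that count into a GFLOPS figure. The function names are hypothetical, and this log does not show which value the commit actually changed, so read it as context rather than as the project's code.

```python
def gemm_flops(m: int, n: int, k: int) -> int:
    """Conventional GEMM operation count: one multiply and one add per term."""
    return 2 * m * n * k


def gflops(m: int, n: int, k: int, seconds: float) -> float:
    """Throughput in GFLOPS for one GEMM call that took `seconds` to run."""
    return gemm_flops(m, n, k) / seconds / 1e9
```

A wrong constant in the count does not change the measured time, only the reported throughput, which is why such a bug tends to show up as implausible benchmark numbers.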
2021-02-06 | Merge pull request #414 from CNugteren/CLBlast-412-python-runtime-libs-fix | Cedric Nugteren
Fix Windows paths in pyclblast
2021-02-05 | Fix Windows paths in pyclblast | Cedric Nugteren
2021-02-04 | Merge pull request #413 from CNugteren/CLBlast-412-python-runtime-libs | Cedric Nugteren
Add library dir on Linux for pyclblast
2021-02-04 | Added second Windows library path | Cedric Nugteren
2021-01-30 | Add library path for Windows as well | Cedric Nugteren
2021-01-29 | Add library dir on Linux for pyclblast | Cedric Nugteren
2021-01-21 | Update pyclblast package version number | Cedric Nugteren
2021-01-21 | Merge pull request #410 from jamesjer/master | Cedric Nugteren
Use reference types to prevent unnecessary copying
2021-01-20 | Use reference types to prevent unnecessary copying | Jerry James
2021-01-19 | Updated to version 1.5.2 | Cedric Nugteren
2020-10-10 | Add tuning results for TITAN RTX | Cedric Nugteren
2020-10-10 | Add tuning results for Radeon RX Vega | Cedric Nugteren
2020-10-05 | Merge pull request #400 from baryluk/patch-6 | Cedric Nugteren
Allow single graph / subplot on plot
2020-10-05 | Allow single graph / subplot on plot | Witold Baryluk
`plt.subplots` tries to be special, and return array or not-array depending on a number of subplots. It is not actually helpful, and IMHO bad design. Make it always `ndarray`. The `and not type(axes) is np.ndarray`, is just in case matplotlib decides to make their behavior more uniform. For now work around it. Also, no need for `ndarray.flat` really. Confirmed to work with existing benchmarks (i.e. rows=2, cols=3), and with single graphs (rows=1, cols=1).
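The behavior described above can be illustrated without matplotlib or NumPy installed. The following is a minimal sketch of the normalization idea only: `as_axes_list`, `FakeAxes` and `FakeGrid` are hypothetical names used for illustration, and the commit itself converts the result to an `ndarray` rather than a list.

```python
def as_axes_list(axes):
    """Return subplot handles as a flat list, whatever shape they arrived in."""
    if hasattr(axes, "flat"):
        # Array-like result (e.g. a numpy.ndarray of Axes for rows * cols > 1).
        return list(axes.flat)
    # A single Axes object, as returned for rows=1, cols=1.
    return [axes]


class FakeAxes:
    """Stand-in for a matplotlib Axes object (illustration only)."""


class FakeGrid:
    """Stand-in for the array returned for multi-plot grids (illustration only)."""

    def __init__(self, items):
        self.flat = list(items)


single = FakeAxes()
grid = FakeGrid(FakeAxes() for _ in range(6))
handles_single = as_axes_list(single)   # one-element list
handles_grid = as_axes_list(grid)       # six-element list
```

With real matplotlib, passing `squeeze=False` to `plt.subplots` is the documented way to always get a 2D array back, which avoids the type check altogether.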
2020-10-04 | Merge pull request #399 from baryluk/patch-3 | Cedric Nugteren
Fix a typo in benchmark when running fp 16 vs 32
2020-10-04 | Fix a typo in benchmark when running fp 16 vs 32 | Witold Baryluk
The intention here was to limit the iteration range to common indexes only. Fix that.
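One way to restrict iteration to the indexes two result lists have in common is to bound the loop by the shorter list. This is a sketch under assumed names (`common_pairs`, `results_fp16`, `results_fp32` are hypothetical), since the benchmark code itself is not shown in this log.

```python
def common_pairs(results_fp16, results_fp32):
    """Pair up results index by index, stopping at the shorter list."""
    count = min(len(results_fp16), len(results_fp32))
    return [(results_fp16[i], results_fp32[i]) for i in range(count)]


# The third fp16 entry has no fp32 counterpart and is skipped.
pairs = common_pairs([1.0, 2.0, 3.0], [10.0, 20.0])
```

The built-in `zip` gives the same truncating behavior; the explicit `range` form is shown because the commit message talks about an iteration range.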
2020-10-04 | Merge pull request #397 from baryluk/patch-1 | Cedric Nugteren
Fix Python SyntaxWarning
2020-10-04 | Merge pull request #398 from baryluk/patch-2 | Cedric Nugteren
Fix --load_from_disk argument help message
2020-10-04 | Fix --load_from_disk argument help message | Witold Baryluk
2020-10-04 | Fix Python SyntaxWarning | Witold Baryluk
There is no guarantee that all empty strings objects are the same or share object with `""` literal.
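The point above is the difference between identity (`is`) and equality (`==`). Recent Python versions emit a `SyntaxWarning` for `is` comparisons against a literal for exactly this reason. A minimal sketch of the safe form, with a hypothetical helper name:

```python
def is_blank(value: str) -> bool:
    """Check for an empty string by value, not by object identity."""
    # `value is ""` would ask whether both names point at the same object,
    # which is an interpreter implementation detail, not a language guarantee.
    return value == ""


# An empty string built at runtime rather than written as a literal.
built_at_runtime = "".join([])
```

Equality is the right test whenever the question is "does this string have no characters", regardless of how or where the string object was created.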
2020-10-03 | Merge pull request #396 from CNugteren/CLBlast-395-fix-benchmark-script | Cedric Nugteren
Fix a Python 3 bug in the benchmark script
2020-10-02 | Fix a Python 3 bug in the benchmark script | Cedric Nugteren
2020-08-16 | Added FUNDING.yml file | Cedric Nugteren
2020-06-07 | Merge pull request #392 from 9prady9/fix_Program_getIR | Cedric Nugteren
Fix Program::GetIR to handle programs with multiple devices
2020-06-07 | Add a cautionary note in Program::GetIR and mention the fix in CHANGELOG | Pradeep Garigipati
2020-06-05 | Fix Program::GetIR to handle programs with multiple devices | Pradeep Garigipati
2020-05-13 | Merge pull request #389 from CNugteren/CLBlast-385-version-defines | Cedric Nugteren
Added version number defines
2020-05-12 | Added CLBLAST_VERSION_MAJOR/MINOR/PATCH defines in headers to store version numbering | Cedric Nugteren
2020-05-11 | Merge pull request #388 from CNugteren/CLBlast-381-gemm-direct-tuner-failure | Cedric Nugteren
Fixed tuners global workgroup size
2020-05-11 | Increase display width of the local/global sizes | Cedric Nugteren
2020-05-10 | Made sure that the global workgroup size is a multiple of the local size in the tuners | Cedric Nugteren
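In OpenCL 1.x, a kernel launched with an explicit local work size needs a global work size that is a multiple of it in every dimension, otherwise the enqueue fails. A common way to guarantee this is to round the global size up to the next multiple. The sketch below shows that arithmetic with a hypothetical helper name. Whether the commit rounds up or enforces the constraint some other way is not visible in this log.

```python
def round_up_to_multiple(global_size: int, local_size: int) -> int:
    """Smallest multiple of local_size that is >= global_size."""
    if local_size <= 0:
        raise ValueError("local_size must be positive")
    # Integer ceiling division, then scale back up to a multiple.
    return ((global_size + local_size - 1) // local_size) * local_size


# 100 work-items with a local size of 32 need 4 work-groups, i.e. 128 items.
padded = round_up_to_multiple(100, 32)
```

When rounding up, the kernel has to ignore the extra work-items, typically with a bounds check on the global ID.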
2020-05-10 | Added logging of local/global workgroup sizes when run the tunersCedric Nugteren
2020-05-10 | Merge pull request #386 from CNugteren/CLBlast-384-pyclblast-missing-routines | Cedric Nugteren
PyCLBlast: add missing batched routines