Age         Commit message  Author
2022-05-23  Fix API inconsistency in cupp11.hpp  Cedric Nugteren
            The function `CopyToAsync` has an optional event argument in the OpenCL version, which is used in CLBlast. This makes the code not compile at all if CUDA (through `cupp11.hpp`) is used as backend. This issue was found by a CLBlast user and reported privately by email. This PR should fix that.
2022-05-17  Merge pull request #437 from umar456/blas_fix  Cedric Nugteren
            Add logic to find intel OpenMP on oneMKL.
2022-05-16  Merge pull request #432 from justingra/sum-fix  Cedric Nugteren
            sum fix
2022-05-15  Add logic to find intel OpenMP on oneMKL.  Umar Arshad
2022-05-13  dev version  Justin Graham
2022-05-13  changelog message  Justin Graham
2022-04-25  Merge pull request #436 from CNugteren/add_tuning_results  Cedric Nugteren
            Add tuning results for 2 AMD GPUs and 1 Qualcomm GPU
2022-04-25  Add tuning results for Adreno 540  Cedric Nugteren
2022-04-25  Add tuning results for Radeon RX 6500 XT  Cedric Nugteren
2022-04-25  Add tuning results for Radeon RX 6800 XT  Cedric Nugteren
2022-04-25  Merge pull request #434 from CNugteren/update_test_status_machines  Cedric Nugteren
            Remove old test machines and add new ones
2022-04-25  Remove old test machines and add new ones  Cedric Nugteren
2022-04-22  sum fix  Justin Graham
2022-04-14  Merge pull request #431 from danyougle/patch-2  Cedric Nugteren
            android.hpp: custom header guard _clang_
2022-04-13  android.hpp: custom header guard of _clang_  danyougle
            In order not to have ambiguous definitions, exclude the functions for other compilers
2022-04-13  Merge pull request #430 from danyougle/patch-1  Cedric Nugteren
            add AMD OCL SDK light path in ENV section
2022-04-13  add AMD OCL SDK light path in ENV section  danyougle
2021-08-27  Merge pull request #425 from CNugteren/tesla_t4_correctness  Cedric Nugteren
            Tesla T4 tuning parameters
2021-08-27  Add Quadro T2000 tuning parameters for the Tesla T4  Cedric Nugteren
2021-08-27  Remove Tesla T4 tuning results  Cedric Nugteren
2021-08-24  Merge pull request #424 from gspr/gspr/prebuilt  Cedric Nugteren
            Update documentation to reflect CLBlast in Debian & Ubuntu
2021-08-24  PPA for older Ubuntus  Gard Spreemann
2021-08-24  Let the installation documentation reflect the fact that CLBlast is now in Debian and Ubuntu  Gard Spreemann
2021-08-20  Merge pull request #423 from CNugteren/new_tuning_results  Cedric Nugteren
            New tuning results for 1 Intel CPU and 5 NVIDIA GPUs
2021-08-19  Added a note on clock frequencies for tuning  Cedric Nugteren
2021-08-19  Updated README and tuning list  Cedric Nugteren
2021-08-19  Add tuning results for NVIDIA Tesla V100  Cedric Nugteren
2021-08-19  Add tuning results for NVIDIA Tesla T4  Cedric Nugteren
2021-08-19  Add tuning results for NVIDIA Quadro T2000  Cedric Nugteren
2021-08-19  Add tuning results for NVIDIA Quadro GV100  Cedric Nugteren
2021-08-19  Add tuning results for Intel Core i9-9980HK  Cedric Nugteren
2021-08-19  Add tuning results for NVIDIA A100  Cedric Nugteren
2021-05-23  Merge pull request #419 from CNugteren/fix_tuner_out_of_bounds_access  Cedric Nugteren
            Fix tuner printing issue
2021-05-22  Fix issue with printing out-of-bounds local/global sizes for level 1 tuners  Cedric Nugteren
2021-04-30  Merge pull request #417 from gspr/gspr/capitalization-typo  Cedric Nugteren
            Correct capitalization typo
2021-04-30  Correct capitalization typo  Gard Spreemann
            The CLBlastConfig.cmake file was installed to a directory named CLBLast (notice second capital l), which can cause issues for CMake's search path when looking for CLBlast on the system. This commit also fixes other occurrences of the wrong capitalization, all of it purely cosmetic (i.e. in comments).
2021-03-15  Merge pull request #416 from JishinMaster/master  Cedric Nugteren
            set the correct flop count for xgemm
2021-03-13  set the correct flop count for xgemm  JishinMaster
2021-02-06  Merge pull request #414 from CNugteren/CLBlast-412-python-runtime-libs-fix  Cedric Nugteren
            Fix Windows paths in pyclblast
2021-02-05  Fix Windows paths in pyclblast  Cedric Nugteren
2021-02-04  Merge pull request #413 from CNugteren/CLBlast-412-python-runtime-libs  Cedric Nugteren
            Add library dir on Linux for pyclblast
2021-02-04  Added second Windows library path  Cedric Nugteren
2021-01-30  Add library path for Windows as well  Cedric Nugteren
2021-01-29  Add library dir on Linux for pyclblast  Cedric Nugteren
2021-01-21  Update pyclblast package version number  Cedric Nugteren
2021-01-21  Merge pull request #410 from jamesjer/master  Cedric Nugteren
            Use reference types to prevent unnecessary copying
2021-01-20  Use reference types to prevent unnecessary copying  Jerry James
2021-01-19  Updated to version 1.5.2  Cedric Nugteren
2020-10-10  Add tuning results for TITAN RTX  Cedric Nugteren
2020-10-10  Add tuning results for Radeon RX Vega  Cedric Nugteren