Age | Commit message | Author | |
---|---|---|---|
2023-01-21 | Add tuning results for Intel FPGA emulation device | Cedric Nugteren | |
2023-01-21 | Add tuning results for Radeon Pro 450 | Cedric Nugteren | |
2023-01-21 | Add tuning results for Adreno 740 | Cedric Nugteren | |
2023-01-21 | Add tuning results for Adreno 730 | Cedric Nugteren | |
2023-01-21 | Merge pull request #451 from CodeLinaro/master | Cedric Nugteren | |
CLBlast modifications to address Qualcomm Adreno performance | |||
2023-01-17 | Updated according to feedback from CNugteren | Angus, Alexander | |
2023-01-12 | Adreno 730 + 740 CLBlast tuning results | Angus, Alexander | |
2023-01-03 | implemented changes to boost Adreno performance according to https://jira-dc.qualcomm.com/jira/browse/OSR-8731 | Angus, Alexander | |
2022-10-14 | Merge pull request #447 from CNugteren/small_plotting_fixes | Cedric Nugteren | |
Fix two small issues in the plotting script | |||
2022-10-13 | Update changelog | Cedric Nugteren | |
2022-10-13 | Fix plotting issue with a single row or column | Cedric Nugteren | |
2022-10-13 | Fix plotting issue in case of 'inf' values | Cedric Nugteren | |
2022-09-27 | Merge pull request #442 from CNugteren/update_version_to_1_5_3 | Cedric Nugteren | |
Update to version 1.5.3 | |||
2022-09-27 | Fix opencl.hpp download in CMake | Cedric Nugteren | |
2022-09-27 | Properly set OpenCL target to version 2.1 | Cedric Nugteren | |
2022-09-22 | Replace the broken khronos registry link for cl.hpp with a new github link for opencl.hpp | Cedric Nugteren | |
2022-09-22 | Update PyCLBlast version number | Cedric Nugteren | |
2022-09-22 | Update to version 1.5.3 | Cedric Nugteren | |
2022-06-24 | Fix typo in comment | Cedric Nugteren | |
Resolves https://github.com/CNugteren/CLBlast/issues/440 | |||
2022-05-25 | Merge pull request #438 from CNugteren/cupp11_api_inconsistency | Cedric Nugteren | |
Fix API inconsistency in cupp11.hpp | |||
2022-05-23 | Fix API inconsistency in cupp11.hpp | Cedric Nugteren | |
The function `CopyToAsync` has an optional event argument in the OpenCL version, which is used in CLBlast. This makes the code not compile at all if CUDA (through `cupp11.hpp`) is used as backend. This issue was found by a CLBlast user and reported privately by email. This PR should fix that. | |||
2022-05-17 | Merge pull request #437 from umar456/blas_fix | Cedric Nugteren | |
Add logic to find intel OpenMP on oneMKL. | |||
2022-05-16 | Merge pull request #432 from justingra/sum-fix | Cedric Nugteren | |
sum fix | |||
2022-05-15 | Add logic to find intel OpenMP on oneMKL. | Umar Arshad | |
2022-05-13 | dev version | Justin Graham | |
2022-05-13 | changelog message | Justin Graham | |
2022-04-25 | Merge pull request #436 from CNugteren/add_tuning_results | Cedric Nugteren | |
Add tuning results for 2 AMD GPUs and 1 Qualcomm GPU | |||
2022-04-25 | Add tuning results for Adreno 540 | Cedric Nugteren | |
2022-04-25 | Add tuning results for Radeon RX 6500 XT | Cedric Nugteren | |
2022-04-25 | Add tuning results for Radeon RX 6800 XT | Cedric Nugteren | |
2022-04-25 | Merge pull request #434 from CNugteren/update_test_status_machines | Cedric Nugteren | |
Remove old test machines and add new ones | |||
2022-04-25 | Remove old test machines and add new ones | Cedric Nugteren | |
2022-04-22 | sum fix | Justin Graham | |
2022-04-14 | Merge pull request #431 from danyougle/patch-2 | Cedric Nugteren | |
android.hpp: custom header guard _clang_ | |||
2022-04-13 | android.hpp: custom header guard of _clang_ | danyougle | |
In order not to have ambiguous definitions, exclude the functions for other compilers | |||
2022-04-13 | Merge pull request #430 from danyougle/patch-1 | Cedric Nugteren | |
add AMD OCL SDK light path in ENV section | |||
2022-04-13 | add AMD OCL SDK light path in ENV section | danyougle | |
2021-08-27 | Merge pull request #425 from CNugteren/tesla_t4_correctness | Cedric Nugteren | |
Tesla T4 tuning parameters | |||
2021-08-27 | Add Quadro T2000 tuning parameters for the Tesla T4 | Cedric Nugteren | |
2021-08-27 | Remove Tesla T4 tuning results | Cedric Nugteren | |
2021-08-24 | Merge pull request #424 from gspr/gspr/prebuilt | Cedric Nugteren | |
Update documentation to reflect CLBlast in Debian & Ubuntu | |||
2021-08-24 | PPA for older Ubuntus | Gard Spreemann | |
2021-08-24 | Let the installation documentation reflect the fact that CLBlast is now in Debian and Ubuntu | Gard Spreemann | |
2021-08-20 | Merge pull request #423 from CNugteren/new_tuning_results | Cedric Nugteren | |
New tuning results for 1 Intel CPU and 5 NVIDIA GPUs | |||
2021-08-19 | Added a note on clock frequencies for tuning | Cedric Nugteren | |
2021-08-19 | Updated README and tuning list | Cedric Nugteren | |
2021-08-19 | Add tuning results for NVIDIA Tesla V100 | Cedric Nugteren | |
2021-08-19 | Add tuning results for NVIDIA Tesla T4 | Cedric Nugteren | |
2021-08-19 | Add tuning results for NVIDIA Quadro T2000 | Cedric Nugteren | |
2021-08-19 | Add tuning results for NVIDIA Quadro GV100 | Cedric Nugteren | |