Age | Commit message | Author | |
---|---|---|---|
2021-08-27 | Merge pull request #425 from CNugteren/tesla_t4_correctness | Cedric Nugteren | |
Tesla T4 tuning parameters | |||
2021-08-27 | Add Quadro T2000 tuning parameters for the Tesla T4 | Cedric Nugteren | |
2021-08-27 | Remove Tesla T4 tuning results | Cedric Nugteren | |
2021-08-24 | Merge pull request #424 from gspr/gspr/prebuilt | Cedric Nugteren | |
Update documentation to reflect CLBlast in Debian & Ubuntu | |||
2021-08-24 | PPA for older Ubuntus | Gard Spreemann | |
2021-08-24 | Let the installation documentation reflect the fact that CLBlast is now in Debian and Ubuntu | Gard Spreemann | |
2021-08-20 | Merge pull request #423 from CNugteren/new_tuning_results | Cedric Nugteren | |
New tuning results for 1 Intel CPU and 5 NVIDIA GPUs | |||
2021-08-19 | Added a note on clock frequencies for tuning | Cedric Nugteren | |
2021-08-19 | Updated README and tuning list | Cedric Nugteren | |
2021-08-19 | Add tuning results for NVIDIA Tesla V100 | Cedric Nugteren | |
2021-08-19 | Add tuning results for NVIDIA Tesla T4 | Cedric Nugteren | |
2021-08-19 | Add tuning results for NVIDIA Quadro T2000 | Cedric Nugteren | |
2021-08-19 | Add tuning results for NVIDIA Quadro GV100 | Cedric Nugteren | |
2021-08-19 | Add tuning results for Intel Core i9-9980HK | Cedric Nugteren | |
2021-08-19 | Add tuning results for NVIDIA A100 | Cedric Nugteren | |
2021-05-23 | Merge pull request #419 from CNugteren/fix_tuner_out_of_bounds_access | Cedric Nugteren | |
Fix tuner printing issue | |||
2021-05-22 | Fix issue with printing out-of-bounds local/global sizes for level 1 tuners | Cedric Nugteren | |
2021-04-30 | Merge pull request #417 from gspr/gspr/capitalization-typo | Cedric Nugteren | |
Correct capitalization typo | |||
2021-04-30 | Correct capitalization typo | Gard Spreemann | |
The CLBlastConfig.cmake file was installed to a directory named CLBLast (note the second capital L), which can cause issues for CMake's search path when looking for CLBlast on the system. This commit also fixes other occurrences of the wrong capitalization, all of them purely cosmetic (i.e. in comments). | |||
2021-03-15 | Merge pull request #416 from JishinMaster/master | Cedric Nugteren | |
set the correct flop count for xgemm | |||
2021-03-13 | set the correct flop count for xgemm | JishinMaster | |
2021-02-06 | Merge pull request #414 from CNugteren/CLBlast-412-python-runtime-libs-fix | Cedric Nugteren | |
Fix Windows paths in pyclblast | |||
2021-02-05 | Fix Windows paths in pyclblast | Cedric Nugteren | |
2021-02-04 | Merge pull request #413 from CNugteren/CLBlast-412-python-runtime-libs | Cedric Nugteren | |
Add library dir on Linux for pyclblast | |||
2021-02-04 | Added second Windows library path | Cedric Nugteren | |
2021-01-30 | Add library path for Windows as well | Cedric Nugteren | |
2021-01-29 | Add library dir on Linux for pyclblast | Cedric Nugteren | |
2021-01-21 | Update pyclblast package version number | Cedric Nugteren | |
2021-01-21 | Merge pull request #410 from jamesjer/master | Cedric Nugteren | |
Use reference types to prevent unnecessary copying | |||
2021-01-20 | Use reference types to prevent unnecessary copying | Jerry James | |
2021-01-19 | Updated to version 1.5.2 | Cedric Nugteren | |
2020-10-10 | Add tuning results for TITAN RTX | Cedric Nugteren | |
2020-10-10 | Add tuning results for Radeon RX Vega | Cedric Nugteren | |
2020-10-05 | Merge pull request #400 from baryluk/patch-6 | Cedric Nugteren | |
Allow single graph / subplot on plot | |||
2020-10-05 | Allow single graph / subplot on plot | Witold Baryluk | |
`plt.subplots` tries to be special and returns an array or a non-array depending on the number of subplots. This is not actually helpful, and IMHO bad design. Make it always an `ndarray`. The `and not type(axes) is np.ndarray` check is just in case matplotlib decides to make its behavior more uniform. For now, work around it. Also, there is no real need for `ndarray.flat`. Confirmed to work with existing benchmarks (i.e. rows=2, cols=3) and with single graphs (rows=1, cols=1). | |||
2020-10-04 | Merge pull request #399 from baryluk/patch-3 | Cedric Nugteren | |
Fix a typo in benchmark when running fp 16 vs 32 | |||
2020-10-04 | Fix a typo in benchmark when running fp 16 vs 32 | Witold Baryluk | |
The intention here was to limit the iteration range to common indexes only. Fix that. | |||
2020-10-04 | Merge pull request #397 from baryluk/patch-1 | Cedric Nugteren | |
Fix Python SyntaxWarning | |||
2020-10-04 | Merge pull request #398 from baryluk/patch-2 | Cedric Nugteren | |
Fix --load_from_disk argument help message | |||
2020-10-04 | Fix --load_from_disk argument help message | Witold Baryluk | |
2020-10-04 | Fix Python SyntaxWarning | Witold Baryluk | |
There is no guarantee that all empty string objects are the same object as the `""` literal. | |||
2020-10-03 | Merge pull request #396 from CNugteren/CLBlast-395-fix-benchmark-script | Cedric Nugteren | |
Fix a Python 3 bug in the benchmark script | |||
2020-10-02 | Fix a Python 3 bug in the benchmark script | Cedric Nugteren | |
2020-08-16 | Added FUNDING.yml file | Cedric Nugteren | |
2020-06-07 | Merge pull request #392 from 9prady9/fix_Program_getIR | Cedric Nugteren | |
Fix Program::GetIR to handle programs with multiple devices | |||
2020-06-07 | Add a cautionary note in Program::GetIR and mention the fix in CHANGELOG | Pradeep Garigipati | |
2020-06-05 | Fix Program::GetIR to handle programs with multiple devices | Pradeep Garigipati | |
2020-05-13 | Merge pull request #389 from CNugteren/CLBlast-385-version-defines | Cedric Nugteren | |
Added version number defines | |||
2020-05-12 | Added CLBLAST_VERSION_MAJOR/MINOR/PATCH defines in headers to store version numbering | Cedric Nugteren | |
2020-05-11 | Merge pull request #388 from CNugteren/CLBlast-381-gemm-direct-tuner-failure | Cedric Nugteren | |
Fixed tuners global workgroup size |