Age | Commit message | Author | |
---|---|---|---|
2023-06-16 | Fix pointer error in `pyclblast` on ARM (#490) | Yubraj Bhoi | |
* Fix pointer error in `pyclblast` on ARM: use `ptrdiff_t` instead of `size_t` for pointers; fix error in `setup.py` * Fix ARM pointer error in `pyclblast` generator; update CHANGELOG file | |||
2023-05-07 | AMAX/AMIN integer testing and bug fixes (#457) | Cedric Nugteren | |
* Fixed a bug in XAMAX/XAMIN routines that caused the increment and offset to be included in the result * Perform proper integer-output testing in XAMAX tests * A few changes towards getting it ready for a PR * Also fix compilation for clBLAS and cuBLAS references * Fix a bug that would only use the real part of complex numbers in the amax/amin routines * A few small fixes related to the AMAX tests | |||
2023-03-25 | Fix documentation bug w.r.t. ld values and matrix layout | Cedric Nugteren | |
2022-10-13 | Fix plotting issue with a single row or column | Cedric Nugteren | |
2022-10-13 | Fix plotting issue in case of 'inf' values | Cedric Nugteren | |
2020-10-05 | Allow single graph / subplot on plot | Witold Baryluk | |
`plt.subplots` tries to be special and returns an array or a non-array depending on the number of subplots. It is not actually helpful, and IMHO bad design. Make it always an `ndarray`. The `and not type(axes) is np.ndarray` check is just in case matplotlib decides to make its behavior more uniform. For now, work around it. Also, there is really no need for `ndarray.flat`. Confirmed to work with existing benchmarks (i.e. rows=2, cols=3), and with single graphs (rows=1, cols=1). | |||
2020-10-04 | Fix a typo in benchmark when running fp 16 vs 32 | Witold Baryluk | |
The intention here was to limit the iteration range to common indexes only. Fix that. | |||
2020-10-04 | Merge pull request #397 from baryluk/patch-1 | Cedric Nugteren | |
Fix Python SyntaxWarning | |||
2020-10-04 | Fix --load_from_disk argument help message | Witold Baryluk | |
2020-10-04 | Fix Python SyntaxWarning | Witold Baryluk | |
There is no guarantee that all empty string objects are the same object or share an object with the `""` literal. | |||
2020-10-02 | Fix a Python 3 bug in the benchmark script | Cedric Nugteren | |
2020-05-12 | Added CLBLAST_VERSION_MAJOR/MINOR/PATCH defines in headers to store version numbering | Cedric Nugteren | |
2020-05-10 | Added pyclblast bindings for the 3 batched routines | Cedric Nugteren | |
2020-03-08 | Update API documentation | Cedric Nugteren | |
2019-01-23 | Added fp32 to fp16 conversion function in Python to make haxpy example work | Cedric Nugteren | |
2019-01-22 | Added experimental support for half-precision in pyclblast | Cedric Nugteren | |
2018-12-31 | Added support for the convgemm tuner in the tuner database | Cedric Nugteren | |
2018-11-12 | Add kernel_mode option to im2col, col2im, and convgemm functions | Koichi Akabe | |
2018-11-07 | Changed col2im to append to the existing im-buffer | Cedric Nugteren | |
2018-10-23 | Added groundwork for col2im algorithm plus first non-working version of kernel and test | Cedric Nugteren | |
2018-09-16 | Merge branch 'master' into convgemm_multi_kernel | Cedric Nugteren | |
2018-08-05 | Added an option to compile the Netlib API with static OpenCL device and context | Cedric Nugteren | |
2018-07-29 | Removed complex numbers support for CONVGEMM | Cedric Nugteren | |
2018-07-29 | Merge branch 'master' into CLBlast-267-convgemm | Cedric Nugteren | |
2018-07-13 | Added tuning results for HD Graphics 6000 Broadwell GT3 | Cedric Nugteren | |
2018-05-09 | Updated the documentation for convgemm to include data layout (NCHW) | Cedric Nugteren | |
2018-05-06 | Added convgemm skeleton, test infrastructure, and first reference implementation | Cedric Nugteren | |
2018-05-05 | Added interface of batched convolution as GEMM | Cedric Nugteren | |
2018-04-15 | Updated tuning results for the Skylake ULT GT2 GPU with the new kernel | Cedric Nugteren | |
2018-04-10 | Made it possible to add tuning parameters to the database using the script | Cedric Nugteren | |
2018-04-10 | Fixed a bug in the compression part of the database script | Cedric Nugteren | |
2018-04-08 | Extended the maximum number of tuning parameters from 14 to 16 | Cedric Nugteren | |
2018-04-07 | Fixed a python3 import error issue with the database script | Cedric Nugteren | |
2018-03-27 | merged | kodonell | |
2018-03-27 | got the generator thing working | kodonell | |
2018-03-11 | Merge pull request #262 from CNugteren/CLBlast-237-tuning-api | Cedric Nugteren | |
CLBlast #237: Tuning API | |||
2018-03-10 | Made benchmarking script also work for complex numbers | Cedric Nugteren | |
2018-03-10 | Updated the documentation for the tuner API | Cedric Nugteren | |
2018-03-10 | Fixed a few things for the new tuning API | Cedric Nugteren | |
2018-03-03 | Fixed some small issues regarding PR#253 | Cedric Nugteren | |
2018-03-03 | Added C API for getting GEMM temp buffer size | sivagnanamn | |
2018-02-25 | Generated PyCLBlast docstrings | Cedric Nugteren | |
2018-02-25 | Some style improvements in the pyclblast code generator | Cedric Nugteren | |
2018-02-25 | Added API documentation for two missing C++ functions | Cedric Nugteren | |
2018-02-24 | Renamed the API documentation | Cedric Nugteren | |
2018-02-21 | Fixed duplication of parameter descriptions by the doc generator | Kirill Mavreshko | |
2018-02-18 | Prepared PyCLBlast for release as a package on PyPI | Cedric Nugteren | |
2018-02-18 | Added all other level 1/2/3 routines to pyclblast | Cedric Nugteren | |
2018-02-18 | Added GEMM to the Python wrapper | Cedric Nugteren | |
2018-02-14 | First generated version (clblastXswap only for now) of the pyclblast wrapper | Cedric Nugteren | |
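
The 2023-05-07 entry describes two AMAX/AMIN fixes: the increment and offset leaked into the returned index, and only the real part of complex numbers was considered. The sketch below is a plain-Python reference for the intended behavior, not CLBlast code. The name `reference_iamax` is hypothetical, and the sketch assumes a 0-based logical index and the reference-BLAS magnitude convention `|re| + |im|` for complex values; whether CLBlast uses exactly these conventions is not stated in the log.

```python
def reference_iamax(n, buffer, offset=0, inc=1):
    """Return the logical index (0..n-1) of the largest-magnitude element.

    The vector is read from buffer[offset], buffer[offset + inc], ...
    Assumes n >= 1. Hypothetical helper for illustration only.
    """
    def magnitude(value):
        # Assumed convention: complex magnitude is |re| + |im|,
        # so the imaginary part is never ignored.
        if isinstance(value, complex):
            return abs(value.real) + abs(value.imag)
        return abs(value)

    best_index = 0
    best_magnitude = magnitude(buffer[offset])
    for i in range(1, n):
        current = magnitude(buffer[offset + i * inc])
        if current > best_magnitude:
            # Store the logical index i, not the raw buffer position.
            best_index, best_magnitude = i, current
    return best_index


# Strided real vector: logical elements are 1.0, -5.0, 2.0.
print(reference_iamax(3, [9.0, 1.0, 0.0, -5.0, 0.0, 2.0], offset=1, inc=2))
# Complex vector where the real part alone would pick the wrong element.
print(reference_iamax(2, [1 + 0j, 0.5 + 4j]))
```

Returning `i` rather than `offset + i * inc` is the point of the first fix: the caller gets a position within the vector, independent of how the vector is laid out in the buffer.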
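
The 2020-10-04 "Fix Python SyntaxWarning" entry rests on the difference between identity and equality for strings. A minimal sketch of the safe comparison follows; the helper name `is_empty` is hypothetical and does not come from the benchmark script.

```python
def is_empty(value):
    """Return True when value is an empty string, compared by content."""
    # `value is ""` would test object identity, which the language does not
    # guarantee for equal strings; Python 3.8 and later emit a SyntaxWarning
    # for `is` comparisons against a literal. `==` compares contents.
    return value == ""


# An empty string produced at runtime rather than written as a literal.
runtime_empty = "benchmark"[:0]
print(is_empty(runtime_empty))
print(is_empty("fp16"))
```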
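
The 2020-10-04 "Fix a typo in benchmark when running fp 16 vs 32" entry says the loop was meant to cover only the indexes common to both sides. One reading of that, assuming two result lists of possibly different lengths, is sketched here. The names `common_pairs`, `results_fp16`, and `results_fp32` are hypothetical and not taken from the benchmark script.

```python
def common_pairs(results_a, results_b):
    """Pair entries by index, stopping at the end of the shorter list."""
    # Only indexes valid for both lists are visited.
    common_length = min(len(results_a), len(results_b))
    return [(results_a[i], results_b[i]) for i in range(common_length)]


results_fp16 = [1.0, 2.0, 3.0]
results_fp32 = [1.5, 2.5]
for half, single in common_pairs(results_fp16, results_fp32):
    print(half, single)
```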