Development version (next release)
- Re-organized test/client infrastructure to avoid code duplication
- Added an optional bypass for pre/post-processing kernels in level-3 routines
- Significantly improved performance of level-3 routines on AMD GPUs
- Added level-3 routines:
  * CHEMM/ZHEMM
  * SSYRK/DSYRK/CSYRK/ZSYRK
  * CHERK/ZHERK
  * SSYR2K/DSYR2K/CSYR2K/ZSYR2K
  * CHER2K/ZHER2K
  * STRMM/DTRMM/CTRMM/ZTRMM

Version 0.2.0
- Added support for complex conjugate transpose
- Several host-code performance improvements
- Improved testing infrastructure and coverage
- Added level-2 routines:
  * SGEMV/DGEMV/CGEMV/ZGEMV
- Added level-3 routines:
  * CGEMM/ZGEMM
  * CSYMM/ZSYMM

Version 0.1.0
- Initial preview release on GitHub
- Supported level-1 routines:
  * SAXPY/DAXPY/CAXPY/ZAXPY
- Supported level-3 routines:
  * SGEMM/DGEMM
  * SSYMM/DSYMM