author | Cedric Nugteren <web@cedricnugteren.nl> | 2018-01-28 14:50:03 +0100 |
---|---|---|
committer | Cedric Nugteren <web@cedricnugteren.nl> | 2018-01-28 14:50:03 +0100 |
commit | 97e92cb10ce4b12109966754db4190b6565fcaa1 (patch) | |
tree | b97acc3215f193372515e2ec12bdc7781db9b881 /README.md | |
parent | 180532ea398891c2366b8ca6cfcc215208528f2c (diff) |
Updated the known issues
Diffstat (limited to 'README.md')
-rw-r--r-- | README.md | 13 |
1 files changed, 9 insertions, 4 deletions
@@ -305,10 +305,11 @@ CLBlast supports almost all the Netlib BLAS routines plus a couple of extra non-

 Furthermore, there are also batched versions of BLAS routines available, processing multiple smaller computations in one go for better performance:

-| Batched      | S | D | C | Z | H |
-| -------------|---|---|---|---|---|
-| xAXPYBATCHED | ✔ | ✔ | ✔ | ✔ | ✔ |
-| xGEMMBATCHED | ✔ | ✔ | ✔ | ✔ | ✔ |
+| Batched             | S | D | C | Z | H |
+| --------------------|---|---|---|---|---|
+| xAXPYBATCHED        | ✔ | ✔ | ✔ | ✔ | ✔ |
+| xGEMMBATCHED        | ✔ | ✔ | ✔ | ✔ | ✔ |
+| xGEMMSTRIDEDBATCHED | ✔ | ✔ | ✔ | ✔ | ✔ |

 In addition, some extra non-BLAS routines are also supported by CLBlast, classified as level-X. They are experimental and should be used with care:

@@ -378,6 +379,10 @@ Other known issues:

 * Half-precision FP16 tests might sometimes fail based on order multiplication, i.e. (a * b) * c != (c * b) * a

+* The AMD APP SDK has a bug causing a conflict with libstdc++, resulting in a segfault when initialising static variables. This has been reported to occur with the CLBlast tuners.
+
+* The AMD run-time compiler has a bug causing it to get stuck in an infinite loop. This is reported to happen occasionally when tuning the CLBlast GEMM routine.
+
 Contributing
 -------------
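
For readers unfamiliar with the xGEMMSTRIDEDBATCHED row added to the table, the following is a minimal CPU sketch of what a strided-batched GEMM conventionally computes: `batch_count` independent products `C_i = alpha * A_i * B_i + beta * C_i`, where matrix `i` of each operand starts at a fixed stride from matrix `i - 1` inside one flat buffer. It is not CLBlast's API. The function name, argument order, and row-major layout are assumptions made for illustration only.

```python
def gemm_strided_batched(m, n, k, alpha, a, a_stride,
                         b, b_stride, beta, c, c_stride, batch_count):
    """Reference strided-batched GEMM on flat row-major lists.

    Hypothetical helper for illustration; not part of CLBlast.
    A_i is m x k, B_i is k x n, C_i is m x n.
    """
    for batch in range(batch_count):
        # Each matrix in the batch starts a fixed stride after the previous one.
        a_off = batch * a_stride
        b_off = batch * b_stride
        c_off = batch * c_stride
        for i in range(m):
            for j in range(n):
                acc = 0.0
                for p in range(k):
                    acc += a[a_off + i * k + p] * b[b_off + p * n + j]
                idx = c_off + i * n + j
                c[idx] = alpha * acc + beta * c[idx]
    return c


# Two independent 2x2 products stored back to back (stride 4 elements).
a = [1.0, 2.0, 3.0, 4.0,  1.0, 0.0, 0.0, 1.0]   # A_0, then A_1 = identity
b = [1.0, 0.0, 0.0, 1.0,  5.0, 6.0, 7.0, 8.0]   # B_0 = identity, then B_1
c = [0.0] * 8
result = gemm_strided_batched(2, 2, 2, 1.0, a, 4, b, 4, 0.0, c, 4, 2)
```

Because the matrices are addressed by a single stride per operand, the caller passes one buffer per operand rather than a list of per-matrix offsets, which is the usual distinction between a strided-batched and a plain batched interface.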