From 97e92cb10ce4b12109966754db4190b6565fcaa1 Mon Sep 17 00:00:00 2001
From: Cedric Nugteren
Date: Sun, 28 Jan 2018 14:50:03 +0100
Subject: Updated the known issues

---
 README.md | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

(limited to 'README.md')

diff --git a/README.md b/README.md
index db7e16e8..4fc6044d 100644
--- a/README.md
+++ b/README.md
@@ -305,10 +305,11 @@ CLBlast supports almost all the Netlib BLAS routines plus a couple of extra non-
 
 Furthermore, there are also batched versions of BLAS routines available, processing multiple smaller computations in one go for better performance:
 
-| Batched      | S | D | C | Z | H |
-| -------------|---|---|---|---|---|
-| xAXPYBATCHED | ✔ | ✔ | ✔ | ✔ | ✔ |
-| xGEMMBATCHED | ✔ | ✔ | ✔ | ✔ | ✔ |
+| Batched             | S | D | C | Z | H |
+| --------------------|---|---|---|---|---|
+| xAXPYBATCHED        | ✔ | ✔ | ✔ | ✔ | ✔ |
+| xGEMMBATCHED        | ✔ | ✔ | ✔ | ✔ | ✔ |
+| xGEMMSTRIDEDBATCHED | ✔ | ✔ | ✔ | ✔ | ✔ |
 
 In addition, some extra non-BLAS routines are also supported by CLBlast, classified as level-X. They are experimental and should be used with care:
 
@@ -378,6 +379,10 @@ Other known issues:
 
 * Half-precision FP16 tests might sometimes fail based on order multiplication, i.e. (a * b) * c != (c * b) * a
 
+* The AMD APP SDK has a bug causing a conflict with libstdc++, resulting in a segfault when initialising static variables. This has been reported to occur with the CLBlast tuners.
+
+* The AMD run-time compiler has a bug causing it to get stuck in an infinite loop. This is reported to happen occasionally when tuning the CLBlast GEMM routine.
+
 Contributing
 -------------
 
-- 
cgit v1.2.3