-rw-r--r-- README.md 2
-rw-r--r-- ROADMAP.md 12
2 files changed, 13 insertions, 1 deletions
diff --git a/README.md b/README.md
index 65b4818e..0232c3f3 100644
--- a/README.md
+++ b/README.md
@@ -10,7 +10,7 @@ CLBlast: The tuned OpenCL BLAS library
CLBlast is a modern, lightweight, performant and tunable OpenCL BLAS library written in C++11. It is designed to leverage the full performance potential of a wide variety of OpenCL devices from different vendors, including desktop and laptop GPUs, embedded GPUs, and other accelerators. CLBlast implements BLAS routines: basic linear algebra subprograms operating on vectors and matrices. See [the CLBlast website](https://cnugteren.github.io/clblast) for performance reports on various devices as well as the latest CLBlast news.
-The library is not tuned for all possible OpenCL devices: __if out-of-the-box performance is poor, please run the tuners first__. See below for a list of already tuned devices and instructions on how to tune yourself and contribute to future releases of the CLBlast library.
+The library is not tuned for all possible OpenCL devices: __if out-of-the-box performance is poor, please run the tuners first__. See below for a list of already tuned devices and instructions on how to tune yourself and contribute to future releases of the CLBlast library. See also the [CLBlast feature roadmap](ROADMAP.md) to get an indication of the future of CLBlast.
Why CLBlast and not clBLAS or cuBLAS?
diff --git a/ROADMAP.md b/ROADMAP.md
new file mode 100644
index 00000000..07fb1ed2
--- /dev/null
+++ b/ROADMAP.md
@@ -0,0 +1,12 @@
+CLBlast feature road-map
+================
+
+This file gives an overview of the main features planned for addition to CLBlast. A first-order indication time-frame for development time is provided:
+
+| Issue# | When | Who | What |
+| -----------|-------------|-----------|---------------|
+| N/A | Oct '17 | CNugteren | CUDA API for CLBlast |
+| #169, #195 | Oct-Nov '17 | CNugteren | Auto-tuning the kernel selection parameter |
+| #181, #201 | Nov '17 | CNugteren | Compilation for Android and testing on Qualcomm Adreno |
+| #128, #205 | Nov-Dec '17 | CNugteren | Pre-processor for loop unrolling and array-to-register-promotion for e.g. ARM Mali |
+| #169 | '17 | dividiti | Problem-specific tuning parameter selection |