-rw-r--r-- | README.md  |  2
-rw-r--r-- | ROADMAP.md | 12
2 files changed, 13 insertions, 1 deletion
@@ -10,7 +10,7 @@ CLBlast: The tuned OpenCL BLAS library
 
 CLBlast is a modern, lightweight, performant and tunable OpenCL BLAS library written in C++11. It is designed to leverage the full performance potential of a wide variety of OpenCL devices from different vendors, including desktop and laptop GPUs, embedded GPUs, and other accelerators. CLBlast implements BLAS routines: basic linear algebra subprograms operating on vectors and matrices. See [the CLBlast website](https://cnugteren.github.io/clblast) for performance reports on various devices as well as the latest CLBlast news.
 
-The library is not tuned for all possible OpenCL devices: __if out-of-the-box performance is poor, please run the tuners first__. See below for a list of already tuned devices and instructions on how to tune yourself and contribute to future releases of the CLBlast library.
+The library is not tuned for all possible OpenCL devices: __if out-of-the-box performance is poor, please run the tuners first__. See below for a list of already tuned devices and instructions on how to tune yourself and contribute to future releases of the CLBlast library. See also the [CLBlast feature roadmap](ROADMAP.md) to get an indication of the future of CLBlast.
 
 
 Why CLBlast and not clBLAS or cuBLAS?
diff --git a/ROADMAP.md b/ROADMAP.md
new file mode 100644
index 00000000..07fb1ed2
--- /dev/null
+++ b/ROADMAP.md
@@ -0,0 +1,12 @@
+CLBlast feature road-map
+================
+
+This file gives an overview of the main features planned for addition to CLBlast. A first-order indication time-frame for development time is provided:
+
+| Issue# | When | Who | What |
+| -----------|-------------|-----------|---------------|
+| N/A | Oct '17 | CNugteren | CUDA API for CLBlast |
+| #169, #195 | Oct-Nov '17 | CNugteren | Auto-tuning the kernel selection parameter |
+| #181, #201 | Nov '17 | CNugteren | Compilation for Android and testing on Qualcomm Adreno |
+| #128, #205 | Nov-Dec '17 | CNugteren | Pre-processor for loop unrolling and array-to-register-promotion for e.g. ARM Mali |
+| #169 | '17 | dividiti | Problem-specific tuning parameter selection |