author: Cedric Nugteren <web@cedricnugteren.nl> 2017-12-10 16:08:06 +0100
committer: Cedric Nugteren <web@cedricnugteren.nl> 2017-12-10 16:08:06 +0100
commit: 11489e68ef625d872a762b79e43426606a90edea
tree: 744ef07855e00aed8ce038a9923dd1cdd66dda4d
parent: 82467b64c4402100e01af99d00caf3bd89c9cde4
Updated roadmap: completed pre-processor implementation
-rw-r--r-- CHANGELOG 2
-rw-r--r-- ROADMAP.md 21
2 files changed, 13 insertions, 10 deletions
diff --git a/CHANGELOG b/CHANGELOG
index ef16cd0d..0ff7856d 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -3,7 +3,7 @@ Development (next version)
- Re-designed and integrated the auto-tuner, no more dependency on CLTune
- Made it possible to override the tuning parameters in the clients straight from JSON tuning files
- Added OpenCL pre-processor to unroll loops and perform array-to-register promotions for compilers
- which don't this themselves (ARM, Qualcomm) - greatly improves performance on these platforms
+ which don't do this themselves (ARM, Qualcomm) - greatly improves performance on these platforms
- Added tuned parameters for various devices (see README)
Version 1.2.0
diff --git a/ROADMAP.md b/ROADMAP.md
index ad15d16c..18ac0bc5 100644
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -3,12 +3,15 @@ CLBlast feature road-map
This file gives an overview of the main features planned for addition to CLBlast. A first-order indication time-frame for development time is provided:
-| Issue# | When | Who | Status | What |
-| -----------|-------------|-----------|--------|---------------|
-| - | Oct '17 | CNugteren | ✔ | CUDA API for CLBlast |
-| [#169](https://github.com/CNugteren/CLBlast/issues/169), [#195](https://github.com/CNugteren/CLBlast/issues/195) | Oct-Nov '17 | CNugteren | ✔ | Auto-tuning the kernel selection parameter |
-| [#181](https://github.com/CNugteren/CLBlast/issues/181), [#201](https://github.com/CNugteren/CLBlast/issues/201) | Nov '17 | CNugteren | ✔ | Compilation for Android and testing on a device |
-| - | Nov '17 | CNugteren | ✔ | Integration of CLTune for easy testing on Android / fewer dependencies |
-| [#128](https://github.com/CNugteren/CLBlast/issues/128), [#205](https://github.com/CNugteren/CLBlast/issues/205) | Nov-Dec '17 | CNugteren | | Pre-processor for loop unrolling and array-to-register-promotion for e.g. ARM Mali |
-| [#207](https://github.com/CNugteren/CLBlast/issues/207) | Dec '17 | CNugteren | | Tuning of the TRSM/TRSV routines |
-| [#169](https://github.com/CNugteren/CLBlast/issues/169) | '17 | dividiti | | Problem-specific tuning parameter selection |
+| Issue# | When | Who | Status | What |
+| ---------------------------------------------------------------|-------------|-----------|--------|---------------|
+| - | Oct '17 | CNugteren | ✔ | CUDA API for CLBlast |
+| [#169](https://github.com/CNugteren/CLBlast/issues/169) & #195 | Oct-Nov '17 | CNugteren | ✔ | Auto-tuning the kernel selection parameter |
+| [#181](https://github.com/CNugteren/CLBlast/issues/181) & #201 | Nov '17 | CNugteren | ✔ | Compilation for Android and testing on a device |
+| - | Nov '17 | CNugteren | ✔ | Integration of CLTune for easy testing on Android / fewer dependencies |
+| [#128](https://github.com/CNugteren/CLBlast/issues/128) & #205 | Nov-Dec '17 | CNugteren | ✔ | Pre-processor for loop unrolling and array-to-register-promotion for e.g. ARM Mali |
+| [#207](https://github.com/CNugteren/CLBlast/issues/207) | Dec '17 | CNugteren | | Tuning of the TRSM/TRSV routines |
+| [#195](https://github.com/CNugteren/CLBlast/issues/195) | Jan '18 | CNugteren | | Extra GEMM API with pre-allocated temporary buffer |
+| [#224](https://github.com/CNugteren/CLBlast/issues/224) | Jan-Feb '18 | CNugteren | | Implement Hadamard product (element-wise vector-vector product) |
+| [#223](https://github.com/CNugteren/CLBlast/issues/223) | Feb '18 | CNugteren | | Python OpenCL interface |
+| [#169](https://github.com/CNugteren/CLBlast/issues/169) | ?? | dividiti | | Problem-specific tuning parameter selection |