Age | Commit message | Author | |
---|---|---|---|
2016-09-22 | Fixed a bug waiting for an invalid event in case of a non-successful CLBlast ↵ | Cedric Nugteren | |
call in the tests and samples | |||
2016-09-21 | It is now possible to set the OpenCL compiler options through an ↵ | Cedric Nugteren | |
environmental variable | |||
2016-09-21 | Merge branch 'master' into development | Cedric Nugteren | |
2016-09-20 | Merge pull request #100 from gpu/master | Cedric Nugteren | |
Fixed link in README.md | |||
2016-09-20 | Fixed link in README.md | Marco Hutter | |
The GitHub link could be https://github.com/gpu (without "s"), but the website should be OK, too | |||
2016-09-13 | Merge pull request #99 from CNugteren/development | Cedric Nugteren | |
Update to version 0.9.0 | |||
2016-09-13 | Updated to version 0.9.0 | Cedric Nugteren | |
2016-09-13 | Renamed the DEFAULT_DEVICE and DEFAULT_PLATFORM env variables to be in line ↵ | Cedric Nugteren | |
with recent usages of CLBLAST_DEVICE and CLBLAST_PLATFORM | |||
2016-09-13 | Merge pull request #98 from intelfx/no-ignored-attributes | Cedric Nugteren | |
CMakeLists.txt: use -Wno-ignored-attributes to silence unfixable warnings | |||
2016-09-13 | CMakeLists.txt: use -Wno-ignored-attributes to silence unfixable warnings | Ivan Shapovalov | |
2016-09-12 | Split the XGEMM kernel further up: now in 3 parts. This is done because MSVC ↵ | Cedric Nugteren | |
can't handle long strings | |||
2016-09-12 | Merge branch 'database_rewrite' into development | Cedric Nugteren | |
2016-09-12 | Added XgemvFastRot and Xgemm 16-bit tuning results: just defaults which are ↵ | Cedric Nugteren | |
now automatically taken from 32-bit if there are no entries at all | |||
2016-09-11 | Complete re-write of the database script. Changed Pandas for the much faster ↵ | Cedric Nugteren | |
and convenient plain JSON/dict data-type | |||
2016-09-10 | Merge branch 'xgemm_tuner_exhaustive' into development | Cedric Nugteren | |
2016-09-10 | Updated database based on exhaustive tuning results for GEMM for the R9 ↵ | Cedric Nugteren | |
M370X GPU | |||
2016-09-10 | Updated the database script to remove duplicate entries: keeps only the ↵ | Cedric Nugteren | |
best-performing cases for a specific parameters combination | |||
2016-09-06 | Split GEMM tuning in two parts: a small set of tuning parameters which is ↵ | Cedric Nugteren | |
explored exhaustively and a larger set which is explored randomly | |||
2016-09-04 | Refactored the Python C++ generator script; now conforms to the PEP8 styleguide | Cedric Nugteren | |
2016-09-04 | The GEMM kernel no longer adds beta*C in case beta is zero; this would cause ↵ | Cedric Nugteren | |
problems if C contains NaNs | |||
2016-09-03 | Added tuning results for Intel Broadwell 5500 GT2 GPU | Cedric Nugteren | |
2016-09-03 | Updated tuning results for Haswell GT2 Mobile GPU; fixed database script to ↵ | Cedric Nugteren | |
handle duplicate entries of different runs | |||
2016-08-27 | Merge pull request #93 from intelfx/test-read-environment | Cedric Nugteren | |
test/correctness: read platform and device from environment | |||
2016-08-27 | test/correctness: read platform and device from environment | Ivan Shapovalov | |
Support passing environment variables CLBLAST_PLATFORM and CLBLAST_DEVICE instead of -platform and -device arguments to test executables. This is for `ctest`. | |||
2016-08-22 | Merge branch 'database_defaults' into development | Cedric Nugteren | |
2016-08-21 | Also changed the default-default for unknown device types to use the same ↵ | Cedric Nugteren | |
method as for known device groups | |||
2016-08-21 | Increased the ratio of GEMM tuning results to explore; reduced the tuning ↵ | Cedric Nugteren | |
search space to have a better chance to evaluate more likely parameter combinations | |||
2016-08-21 | Updated the changelog; refactored the database-get-bests code a bit | Cedric Nugteren | |
2016-08-20 | Merge branch 'development' of github.com:CNugteren/CLBlast into development | Cedric Nugteren | |
Conflicts: README.md | |||
2016-08-20 | Merge branch 'dvasschemacq-master' into development | Cedric Nugteren | |
2016-08-20 | Merge branch 'master' of https://github.com/dvasschemacq/CLBlast into ↵ | Cedric Nugteren | |
dvasschemacq-master. Conflicts: src/kernels/level1/xaxpy.opencl, src/kernels/level2/xgemv.opencl, src/kernels/level2/xgemv_fast.opencl, src/kernels/level2/xger.opencl, src/kernels/level2/xher.opencl, src/kernels/level2/xher2.opencl, src/kernels/level3/xgemm_part2.opencl | |||
2016-08-18 | Adapt opencl files for 1.1 OpenCL | D. Van Assche | |
In OpenCL 1.1 __kernel has to be before __attribute__, at least with Vivante compiler. | |||
2016-08-15 | Updated the database script to calculate the relative best performance of ↵ | Cedric Nugteren | |
tuning results common for a device/vendor type | |||
2016-08-09 | Improved the speed of the new common-best defaults method for the database ↵ | Cedric Nugteren | |
generation | |||
2016-08-07 | Added a first version of the database's common-best default calculation | Cedric Nugteren | |
2016-07-28 | Minor update regarding the previous CMake export/install target changes | Cedric Nugteren | |
2016-07-28 | Merge pull request #86 from intelfx/cmake | Cedric Nugteren | |
CMakeLists.txt: provide a find_package() config for dependent projects | |||
2016-07-28 | .appveyor.yml: move {OPENCL,CLBLAST}_ROOT out of source tree | Ivan Shapovalov | |
Reasoning is the same as in previous commit: CMake does not like having OpenCL header path inside of the source tree. CLBLAST_ROOT is moved for uniformity. | |||
2016-07-28 | .travis.yml: use OpenCL ICD Loader and headers shipped by distro | Ivan Shapovalov | |
Using our own headers causes problems with CMake which does not like having OpenCL header path inside of the source tree. While at it, use distro's universal OpenCL loader as well. | |||
2016-07-28 | CMakeLists.txt: use target_include_directories() | Ivan Shapovalov | |
2016-07-28 | CMakeLists.txt: provide a find_package() config for dependent projects | Ivan Shapovalov | |
2016-07-26 | Merge branch 'gemv_performance' into development | Cedric Nugteren | |
2016-07-25 | Removed all old tuning results for the XgemvFastRot kernel; re-added for a ↵ | Cedric Nugteren | |
couple of devices | |||
2016-07-25 | Moved the XgemvFast and XgemvFastRot tuning database into a separate file | Cedric Nugteren | |
2016-07-24 | Merge branch 'development' into gemv_performance | Cedric Nugteren | |
2016-07-24 | Minor improvements after merging in groundwork for custom tuning parameters ↵ | Cedric Nugteren | |
and kernels | |||
2016-07-24 | Merge pull request #84 from intelfx/device-specific-kernels | Cedric Nugteren | |
Groundwork for device-specific routines | |||
2016-07-24 | Refactored the Python database script: separated functionality in modules, ↵ | Cedric Nugteren | |
now complies with the PEP8 style, added proper command-line argument parsing, and cleaned up | |||
2016-07-23 | Fixed a bug in the new XgemvFastRot kernel related to local memory size | Cedric Nugteren | |
2016-07-23 | Further improvements to the XgemvFastRot kernel, properly enables coalescing now | Cedric Nugteren | |