Age | Commit message | Author
2017-12-20 | Added tuning results for Apple AMD Radeon Pro 580 | Cedric Nugteren
2017-12-20 | Added try-except to database script parser to skip invalid files | Cedric Nugteren
2017-12-17 | Fixed an issue with the tuner: it was using platform vendor rather than device vendor | Cedric Nugteren
2017-12-17 | Merge pull request #230 from CNugteren/kernel_preprocessor (Added an OpenCL kernel preprocessor) | Cedric Nugteren
2017-12-17 | Removed all ARM Mali tuning results; re-added Mali-T760 and Mali-T628 results based on kernel pre-processor | Cedric Nugteren
2017-12-17 | Fixed an unnecessary overflow issue on 32-bit systems | Cedric Nugteren
2017-12-16 | Updated the known issues | Cedric Nugteren
2017-12-10 | Fixed for error C1091 in MSVC 2013 | Cedric Nugteren
2017-12-10 | Split GEMM kernel in 4 files instead of 3 due to MSVC 2013 string length limit | Cedric Nugteren
2017-12-10 | Updated roadmap: completed pre-processor implementation | Cedric Nugteren
2017-12-10 | Fixed a missing include | Cedric Nugteren
2017-12-10 | Fixed an issue in the tuners to prevent error -14 from persisting (CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST) | Cedric Nugteren
2017-12-10 | Fixed an Android compilation issue | Cedric Nugteren
2017-12-09 | Completed kernel modifications for pre-processor of all other kernels | Cedric Nugteren
2017-12-09 | Made the pre-processor run by default for ARM and Qualcomm GPUs | Cedric Nugteren
2017-12-09 | Modified the direct GEMM kernel to support array-to-register promotion | Cedric Nugteren
2017-12-09 | Reformatted GEMM kernel to support array-to-register promotion | Cedric Nugteren
2017-12-09 | Fixed defines parsing and substituting in pre-processor; fixed some variable names in kernels | Cedric Nugteren
2017-12-07 | Added register promotion to the main GEMM kernel | Cedric Nugteren
2017-12-05 | Improved array-to-register promotion, now handling function calls as well | Cedric Nugteren
2017-12-03 | Added GEMM (direct and in-direct) to the pre-processor testing; modified the loops in kernel accordingly | Cedric Nugteren
2017-12-03 | Added basic bracket parsing in defines and loop expressions | Cedric Nugteren
2017-12-03 | Reformated transpose kernels for the pre-processor; extended the amount of tests | Cedric Nugteren
2017-12-03 | Improved array to register promotion in the pre-processor | Cedric Nugteren
2017-11-30 | Improved the pre-processor's handling of defines; added a special nested defines test | Cedric Nugteren
2017-11-30 | Integrated pre-processor in compilation flow, default is still disabled | Cedric Nugteren
2017-11-29 | Reformatted unrollable kernel loops and added the new promote_to_registers pragma for several kernels | Cedric Nugteren
2017-11-29 | Extended the preprocessor tests to include CopyFast and CopyPad | Cedric Nugteren
2017-11-29 | Improves the array-to-register promotion in the pre-processor | Cedric Nugteren
2017-11-28 | Improved the pre-processor tester, added GEMV and GER kernels | Cedric Nugteren
2017-11-28 | Improved the kernel pre-processor in various ways | Cedric Nugteren
2017-11-27 | Added simple implementation of array-to-register promotion | Cedric Nugteren
2017-11-26 | Improved the for-loop pre-processing | Cedric Nugteren
2017-11-25 | Implemented first simple pre-processor: defines parser and loop unrolling based on assumptions | Cedric Nugteren
2017-11-25 | Moved string splitting functions; added string character removal function | Cedric Nugteren
2017-11-25 | Added stub for a preprocessor and a corresponding compilation test | Cedric Nugteren
2017-11-25 | Merge pull request #222 from CNugteren/override_params_from_json (Override params in clients from tuner JSON) | Cedric Nugteren
2017-11-24 | Fixed a Clang compilation error | Cedric Nugteren
2017-11-24 | Added tuning results for ARM Mali T760 GPU | Cedric Nugteren
2017-11-24 | Added missing include file | Cedric Nugteren
2017-11-24 | Added precision check to parameter override for the clients | Cedric Nugteren
2017-11-22 | Made parameter override in the clients a command-line argument and added support for multi-kernel routines | Cedric Nugteren
2017-11-21 | Implemented first version of reading JSON files from disk in the client to override parameters | Cedric Nugteren
2017-11-20 | Made the database script properly handle multiple entries for a single device | Cedric Nugteren
2017-11-20 | Potentially fixed an MSVC 2013 issue with a copy-constructor not being generated | Cedric Nugteren
2017-11-20 | Fixes some displaying issues in the GEMM routine tuner | Cedric Nugteren
2017-11-19 | Fixed a variety of warnings and an error for MSVC2013 compilation | Cedric Nugteren
2017-11-19 | Merge pull request #216 from CNugteren/integrated_tuner (Integrated tuner) | Cedric Nugteren
2017-11-19 | Minor fix to the database script | Cedric Nugteren
2017-11-19 | Added compilation timing and better compilation error reporting | Cedric Nugteren