path: root/test/performance
Age | Commit message | Author
2023-05-07 | AMAX/AMIN integer testing and bug fixes (#457) | Cedric Nugteren
2018-12-17 | Fix half-float+kernel_mode test cases of im2col, col2im, and convgemm | Koichi Akabe
2018-10-23 | Added groundwork for col2im algorithm plus first non-working version of kerne... | Cedric Nugteren
2018-07-29 | Removed complex numbers support for CONVGEMM | Cedric Nugteren
2018-06-03 | Merge branch 'master' into CLBlast-267-convgemm | Cedric Nugteren
2018-05-27 | Added maximum time reporting to the client statistics | Cedric Nugteren
2018-05-23 | Added an option in the clients to output timing statistics: minimum, mean, an... | Cedric Nugteren
2018-05-09 | Fixed the performance client for convgemm and added GFLOPS measurements | Cedric Nugteren
2018-05-06 | Added convgemm skeleton, test infrastructure, and first reference implementation | Cedric Nugteren
2018-01-31 | Created the API and stubs for the HAD (hadamard-product) routines | Cedric Nugteren
2018-01-14 | Small improvements to benchmarking for cuBLAS | Cedric Nugteren
2018-01-07 | Added API and tests for new GemmStridedBatched routine | Cedric Nugteren
2018-01-03 | Added a queue argument to the get-size function when running the tests/clients | Cedric Nugteren
2017-11-22 | Made parameter override in the clients a command-line argument and added supp... | Cedric Nugteren
2017-11-21 | Implemented first version of reading JSON files from disk in the client to ov... | Cedric Nugteren
2017-10-15 | Prepared test and client infrastructure for use with the CUDA API | Cedric Nugteren
2017-10-01 | GEMM tests now test both the in-direct and the direct kernels seperately | Cedric Nugteren
2017-08-23 | Made the im2col client properly handle the arguments | Cedric Nugteren
2017-08-12 | Merge branch 'master' into im_to_col | Cedric Nugteren
2017-08-12 | Moved some utility functions to a test-specific utility compilation-unit | Cedric Nugteren
2017-07-16 | First step towards supporting im2col in the test infrastructure | Cedric Nugteren
2017-04-17 | Fixed a namespace clash with CUDA FP16 for the half-datatype | Cedric Nugteren
2017-04-13 | Fixed CUDA malloc and cuBLAS handles: cuBLAS as a performance-reference now w... | Cedric Nugteren
2017-04-03 | In-lined the float2 and double2 types to avoid collision with CUDA's definitions | Cedric Nugteren
2017-04-02 | Layed the groundwork for cuBLAS comparisons in the clients | Cedric Nugteren
2017-04-01 | Separated host-device and device-host memory copies from execution of the CBL... | Cedric Nugteren
2017-03-19 | Fixed a compilation issue for GCC/MSVC | Cedric Nugteren
2017-03-12 | Fixed a linker issue for Clang | Cedric Nugteren
2017-03-10 | Added API and test infrastructure for the batched GEMM routine | Cedric Nugteren
2017-03-08 | Make batched routines based on offsets instead of a vector of cl_mem objects ... | Cedric Nugteren
2017-03-05 | Minor fixes to the client w.r.t. the addition of the batch count | Cedric Nugteren
2017-03-05 | Adjusted the test-infrastructure to support testing of batched-versions of ro... | Cedric Nugteren
2017-03-05 | Changed the way the test-data is generated: now using a single MT generator a... | Cedric Nugteren
2017-03-05 | Prepared generator for batched routines; added batched AXPY routine interface | Cedric Nugteren
2017-02-26 | Removed half-precision support from the TRSM routine; too unstable | Cedric Nugteren
2017-02-05 | Merge branch 'development' into triangular_solvers | Cedric Nugteren
2017-01-20 | treewide: include clpp11.hpp first to silence deprecation warnings | Ivan Shapovalov
2017-01-15 | Added a first version of the diagonal block invert routine in preparation of ... | Cedric Nugteren
2016-11-27 | Made it possible to use the command-line environmental variables for each exe... | Cedric Nugteren
2016-10-22 | Moved files around a bit; created a utilities subfolder | Cedric Nugteren
2016-09-27 | Now generates test/client/tuner data using a fixed seed to enable reproducabi... | Cedric Nugteren
2016-07-06 | Added an option to the performance clients to do a warm-up run before timing | Cedric Nugteren
2016-06-27 | Moved the performance graph scripts to the 'scripts' subfolder | Cedric Nugteren
2016-06-19 | Renamed all C++ source files to .cpp to match the .hpp extension better | Cedric Nugteren
2016-06-18 | Moved all headers into the source tree, changed headers to .hpp extension | Cedric Nugteren
2016-06-16 | Added XOMATCOPY routines to perform out-of-place matrix scaling, copying, and... | Cedric Nugteren
2016-05-25 | Added possibility to run the performance client with half-precision | Cedric Nugteren
2016-05-18 | Merged in latest changes from 0.7.1 release | Cedric Nugteren
2016-04-20 | Added support for the iSAMAX/iDAMAX/iCAMAX/iZAMAX routines | cnugteren
2016-04-20 | Added prototype for ixAMAX routines | cnugteren