Age | Commit message | Author
2016-10-18 | Fixed compilation issues of the testers for Visual Studio 2013: mostly conver... | Cedric Nugteren
2016-10-16 | Merge branch 'development' into netlib_blas_api | Cedric Nugteren
2016-10-15 | Added documentation and minor refactoring for the recent support of static li... | Cedric Nugteren
2016-10-15 | Merge pull request #115 from shehzan10/development | Cedric Nugteren
2016-10-14 | Fixes for static lib compilation on Windows | Shehzan Mohammed
2016-10-14 | Fixed a bug where clblas.h couldn't be found for the performance tests (clients) | Cedric Nugteren
2016-10-14 | Fixed an issue with a growing database: the database is now a global variable... | Cedric Nugteren
2016-10-14 | Set proper flags for the verbose mode (debug flags) | Cedric Nugteren
2016-10-14 | Merge pull request #112 from shehzan10/static | Cedric Nugteren
2016-10-13 | Add option to build shared or static library | Shehzan Mohammed
2016-10-13 | Added tuning results for Intel HD Graphics IvyBridge GPU | Cedric Nugteren
2016-10-13 | Merge pull request #108 from CNugteren/msvc2013 | Cedric Nugteren
2016-10-12 | Removed a spurious #ifdef | Cedric Nugteren
2016-10-12 | Fixed missing line ending | Cedric Nugteren
2016-10-10 | Added support for compiling the library, the client, and the samples under MS... | Cedric Nugteren
2016-10-10 | Fixed an issue with const members of structs in the database | Cedric Nugteren
2016-10-10 | Fixed an issue with the length of the GEMM OpenCL string for both MSVC 2013 a... | Cedric Nugteren
2016-10-10 | First fixes towards compilation on Visual Studio 2013 | Cedric Nugteren
2016-10-10 | Changed the storage location of the database to a separate Github repository | Cedric Nugteren
2016-10-10 | Changed the license to MIT | Cedric Nugteren
2016-10-10 | Updated the performance graphs for Intel Iris Pro GPU and AMD Radeon M370X GPU | Cedric Nugteren
2016-10-10 | Added fresh performance graphs for GeForce 750Ti; removed old GTX480 results | Cedric Nugteren
2016-10-10 | Updated the tuning results for the GTX 750 Ti GPU | Cedric Nugteren
2016-10-10 | Merge branch 'gemm_direct' into development | Cedric Nugteren
2016-10-10 | Changed the thresholds for the direct/indirect GEMM kernels for NVIDIA and In... | Cedric Nugteren
2016-10-08 | Added benchmark script for small matrix sizes, testing the direct GEMM kernels | Cedric Nugteren
2016-10-08 | Fixed a performance bug for Intel Iris Pro GPUs due to incorrect tuning results | Cedric Nugteren
2016-10-06 | Added first tuning results for the single-kernel direct GEMM implementation | Cedric Nugteren
2016-10-06 | Added a kernel selection database to select between the direct and indirect G... | Cedric Nugteren
2016-10-05 | Made non-standard types void-pointers in the Netlib BLAS interface | Cedric Nugteren
2016-10-05 | Added first version of Netlib BLAS API header | Cedric Nugteren
2016-10-03 | Fixed a const-correctness issue with complex conjugation in the GEMM direct k... | Cedric Nugteren
2016-10-03 | Added functions to load from off-chip to local memory without vector loads fo... | Cedric Nugteren
2016-10-03 | Re-organised GEMM direct kernel and added faster fall-back version for incomp... | Cedric Nugteren
2016-10-02 | Set the default number of runs for all kernels to at least 2 runs | Cedric Nugteren
2016-10-02 | Specialised the GEMM direct kernel in four ways for transposing/non-transposi... | Cedric Nugteren
2016-10-02 | Split the GEMM direct kernel into two files; set the default tuning target to... | Cedric Nugteren
2016-10-01 | Added padding to the local memory of the GEMM direct kernel | Cedric Nugteren
2016-10-01 | Added default num-runs to the tuner adding averaging over 10 runs as a defaul... | Cedric Nugteren
2016-10-01 | Merge branch 'development' into gemm_direct | Cedric Nugteren
2016-09-27 | Added an option to run tuned kernels multiple times to average execution time... | Cedric Nugteren
2016-09-27 | Updated to version 8.0 of the CLCudaAPI header | Cedric Nugteren
2016-09-27 | Fixed the local memory size computation for the GEMM tuners | Cedric Nugteren
2016-09-27 | Now generates test/client/tuner data using a fixed seed to enable reproducabi... | Cedric Nugteren
2016-09-27 | Added more relaxed error checking for the half-precision tests | Cedric Nugteren
2016-09-27 | Merge pull request #103 from dividiti/link_clblas_with_pthread | Cedric Nugteren
2016-09-26 | Use cross-platform thread lib idiom instead of *nix-specific pthread. | Anton Lokhmotov
2016-09-26 | Link clBLAS together with pthread. | Anton Lokhmotov
2016-09-25 | Added a first version of a tuner for the GEMM direct kernel; collapsed MWGD, ... | Cedric Nugteren
2016-09-25 | Separated the tuning parameters of the new direct GEMM kernel from the indire... | Cedric Nugteren