debian-clblast
Debian package for CLBlast.
gspr@nonempty.org
path: root/test/routines
Age
Commit message
Author
2021-03-13
set the correct flop count for xgemm
JishinMaster
2018-12-17
Fix half-float+kernel_mode test cases of im2col, col2im, and convgemm
Koichi Akabe
2018-11-12
Add kernel_mode option to im2col, col2im, and convgemm functions
Koichi Akabe
2018-11-07
Changed col2im to append to the existing im-buffer
Cedric Nugteren
2018-11-01
Fixed half-precision tests for im2col and col2im
Cedric Nugteren
2018-10-30
Fix col2im implementation
Koichi Akabe
2018-10-23
Added groundwork for col2im algorithm plus first non-working version of kernel and test
Cedric Nugteren
2018-10-22
Some name changes in im2col code
Cedric Nugteren
2018-09-16
Merge branch 'master' into convgemm_multi_kernel
Cedric Nugteren
2018-09-07
Added reference implementation for xCONVGEMM for half-precision
Cedric Nugteren
2018-07-31
Fixed the tests of OMATCOPY to include proper complex conjugation
Cedric Nugteren
2018-05-19
Merge branch 'master' into CLBlast-267-convgemm
Cedric Nugteren
2018-05-17
Fixed a few issues with canary region testing
Cedric Nugteren
2018-05-09
Fixed the performance client for convgemm and added GFLOPS measurements
Cedric Nugteren
2018-05-06
Added convgemm skeleton, test infrastructure, and first reference implementation
Cedric Nugteren
2018-02-02
Implemented the XHAD Hadamard product routine
Cedric Nugteren
2018-01-31
Created the API and stubs for the HAD (hadamard-product) routines
Cedric Nugteren
2018-01-07
Added API and tests for new GemmStridedBatched routine
Cedric Nugteren
2018-01-06
Prevented half-precision batched routines from failing in the tests
Cedric Nugteren
2018-01-06
Added CUDA interface to get temporary-buffer size for GEMM routine
Cedric Nugteren
2018-01-03
Added the temp-buffer to the GEMM testers and clients
Cedric Nugteren
2018-01-03
Added a queue argument to the get-size function when running the tests/clients
Cedric Nugteren
2017-12-23
Fixed unused variable warnings showing up with Clang
Cedric Nugteren
2017-11-19
Fixed a variety of warnings and an error for MSVC2013 compilation
Cedric Nugteren
2017-11-08
Fixed an FP16 issue in the homatcopy test; added a comment about improper testing of integer returning functions for FP16
Cedric Nugteren
2017-11-02
Integrated the GEMM routine tuner for kernel selection; added first tuning results
Cedric Nugteren
2017-10-25
Fixed small bug in (unused) invert tester
Cedric Nugteren
2017-10-15
Fixed a small copy-paste typo
Cedric Nugteren
2017-10-15
Modified test interfaces such that they support either OpenCL or CUDA
Cedric Nugteren
2017-10-15
Fixes for the CUDA API: first tests pass and the client runs
Cedric Nugteren
2017-10-15
Prepared test and client infrastructure for use with the CUDA API
Cedric Nugteren
2017-10-01
GEMM tests now test both the in-direct and the direct kernels seperately
Cedric Nugteren
2017-08-31
Fixed a bug in im2col confusing first and second workgroup size; made im2col kernel 2d instead of 3d
Cedric Nugteren
2017-08-23
Made the im2col client properly handle the arguments
Cedric Nugteren
2017-08-19
Implemented proper im2col reference function and completd tests
Cedric Nugteren
2017-08-12
Merge branch 'master' into im_to_col
Cedric Nugteren
2017-08-12
Moved some utility functions to a test-specific utility compilation-unit
Cedric Nugteren
2017-07-16
First step towards supporting im2col in the test infrastructure
Cedric Nugteren
2017-07-12
Relaxed requirement on a_ld and b_ld for batched GEMM
Cedric Nugteren
2017-06-26
Fixed and suppresses several warnings for MSVC
Cedric Nugteren
2017-05-11
Bug-fix in the half-precision test of the amax routine
Cedric Nugteren
2017-04-23
Fixed a compiler warning message
Cedric Nugteren
2017-04-13
Fixed CUDA malloc and cuBLAS handles: cuBLAS as a performance-reference now works
Cedric Nugteren
2017-04-11
Made compilation of the cuBLAS wrapper work properly
Cedric Nugteren
2017-04-10
Added reference implementations for performance-testing against cuBLAS
Cedric Nugteren
2017-04-03
Fixes the CUDA wrapper (now actually tested on a system with CUDA)
Cedric Nugteren
2017-04-02
Factored out inclusion of clBLAS and CBLAS from the test-routine files
Cedric Nugteren
2017-04-02
Factored out inclusion of clBLAS and CBLAS from the test-routine files
Cedric Nugteren
2017-04-01
Separated host-device and device-host memory copies from execution of the CBLAS reference code; for fair timing and code de-duplication
Cedric Nugteren
2017-03-10
Added API and test infrastructure for the batched GEMM routine
Cedric Nugteren