SpMV
Description: Measures the performance of sparse matrix-vector multiplication (SpMV) using several different algorithms and data structures. The default randomly generated matrices are square, with about 1 percent of their entries nonzero. Alternatively, a Matrix Market file can be loaded using the --mm_filename example.mm argument.
Problem Sizes: (NxN Matrix) - 1024, 8192, 12288, 16384
Precision: Both (single and double)
Includes PCIe Transfer Time: in [testName]_PCIe measurements
SpMV uses three kernels. The first two are based on the compressed sparse row (CSR) data structure. The first kernel assigns one thread (or work-item) to each row of the matrix. The second kernel takes the same approach, except that it assigns a full warp (or a small group of threads) to each row, so the threads in the group share the work of a single row. Both CSR kernels are tested on normal and padded data. The third kernel uses the recently proposed ELLPACKR data structure. For more information on sparse matrix-vector multiplication on GPUs, see the excellent paper by M. Garland.
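The difference between the two CSR kernels can be illustrated with a serial sketch. The code below is not the benchmark's kernel code; it is a plain Python emulation, and the names `csr_scalar_spmv`, `csr_vector_spmv`, `row_ptr`, `cols`, `vals`, and `warp_size` are illustrative assumptions. It assumes `warp_size` is a power of two so that the tree reduction works.

```python
def csr_scalar_spmv(row_ptr, cols, vals, x):
    """Emulates one thread per row: each row is summed sequentially."""
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):  # on a GPU, each iteration would be its own thread
        acc = 0.0
        for j in range(row_ptr[row], row_ptr[row + 1]):
            acc += vals[j] * x[cols[j]]
        y[row] = acc
    return y


def csr_vector_spmv(row_ptr, cols, vals, x, warp_size=32):
    """Emulates one warp per row: lanes stride over the row, then tree-reduce.

    warp_size is assumed to be a power of two.
    """
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):  # on a GPU, each iteration would be its own warp
        start, end = row_ptr[row], row_ptr[row + 1]
        partial = [0.0] * warp_size
        for lane in range(warp_size):
            # Lane k handles nonzeros start+k, start+k+warp_size, ...
            for j in range(start + lane, end, warp_size):
                partial[lane] += vals[j] * x[cols[j]]
        # Tree reduction of the per-lane partial sums into lane 0.
        stride = warp_size // 2
        while stride > 0:
            for lane in range(stride):
                partial[lane] += partial[lane + stride]
            stride //= 2
        y[row] = partial[0]
    return y


# Hypothetical 3x3 example: [[1, 0, 2], [0, 3, 0], [4, 5, 6]] in CSR form.
row_ptr = [0, 2, 3, 6]
cols = [0, 2, 1, 0, 1, 2]
vals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [1.0, 2.0, 3.0]

y_scalar = csr_scalar_spmv(row_ptr, cols, vals, x)
y_vector = csr_vector_spmv(row_ptr, cols, vals, x)
```

With very short rows, as in this example, most lanes of the emulated warp do no work, which is one reason the two approaches can perform differently depending on the number of nonzeros per row.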
Specific Tests (All report GFLOPS)
- CSR-Scalar - performance of the single thread per row CSR kernel
- CSR-Vector - performance of the warp/vector of threads per row CSR kernel
- ELLPACKR - performance of the kernel using the ELLPACKR data structure
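For readers unfamiliar with the layout behind the ELLPACKR test, the following is a minimal serial sketch. It assumes the usual description of the format: values and column indices are padded to the longest row and stored column-major, with an extra array holding the true length of each row. The function and variable names (`csr_to_ellpackr`, `ellpackr_spmv`, `row_len`) are hypothetical and are not taken from the benchmark source.

```python
def csr_to_ellpackr(row_ptr, cols, vals):
    """Converts CSR arrays to padded, column-major arrays plus row lengths."""
    n_rows = len(row_ptr) - 1
    row_len = [row_ptr[r + 1] - row_ptr[r] for r in range(n_rows)]
    width = max(row_len, default=0)  # every row is padded to the longest row
    ell_vals = [0.0] * (n_rows * width)
    ell_cols = [0] * (n_rows * width)
    for r in range(n_rows):
        for k in range(row_len[r]):
            src = row_ptr[r] + k
            dst = k * n_rows + r  # column-major: k-th entries of all rows are adjacent
            ell_vals[dst] = vals[src]
            ell_cols[dst] = cols[src]
    return ell_vals, ell_cols, row_len


def ellpackr_spmv(ell_vals, ell_cols, row_len, x):
    """Emulates one thread per row; each row stops at its own length."""
    n_rows = len(row_len)
    y = [0.0] * n_rows
    for r in range(n_rows):  # on a GPU, each iteration would be its own thread
        acc = 0.0
        for k in range(row_len[r]):  # padding entries are never read
            idx = k * n_rows + r
            acc += ell_vals[idx] * x[ell_cols[idx]]
        y[r] = acc
    return y


# Hypothetical 3x3 example: [[1, 0, 2], [0, 3, 0], [4, 5, 6]] in CSR form.
row_ptr = [0, 2, 3, 6]
cols = [0, 2, 1, 0, 1, 2]
vals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [1.0, 2.0, 3.0]

ell_vals, ell_cols, row_len = csr_to_ellpackr(row_ptr, cols, vals)
y_ell = ellpackr_spmv(ell_vals, ell_cols, row_len, x)
```

In the column-major layout, neighboring threads read neighboring memory locations at each step of the loop, and the row-length array lets each thread skip its padding.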