IC3: International Conference on Contemporary Computing

Sanjay Ranka

University of Florida, USA

Title of the Talk: GPU Acceleration of Sparse Matrix Applications

This talk presents our work on GPU-based heterogeneous high-performance parallel computing for sparse QR factorization using the multifrontal method. The method assembles and factorizes all frontal matrices on the GPU and does not transfer data between the GPU and the CPU during the intermediate steps.

Our prototype software exceeds 80 GFlops for large sparse matrices on the NVIDIA Fermi GPU. This represents an 8x speedup over highly optimized multicore sparse QR on the CPU.