Title of the Talk: GPU Acceleration of Sparse Matrix Applications
This talk presents our work on GPU-based heterogeneous high-performance
parallel computing for sparse multifrontal QR factorization.
The method assembles and factorizes all frontal matrices on
the GPU, and does not transfer data between the GPU and CPU
in the intermediate steps.
Our prototype software
exceeds 80 GFlops for large sparse matrices on the NVIDIA
Fermi GPU. This represents an 8x speedup over a highly optimized
multicore sparse QR on the CPU.
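
To illustrate what "assembling and factorizing frontal matrices" means, here is a minimal CPU-only sketch in NumPy. It is not the prototype software described in this talk and involves no GPU code. The tiny 6-by-3 matrix, its sparsity pattern, and the helper name factorize_front are assumptions made purely for illustration. Each front is a small dense matrix: its QR factorization yields some rows of the final R plus a contribution block that is stacked into the parent front.

```python
import numpy as np

def factorize_front(front, npiv):
    """Dense QR of one frontal matrix (hypothetical helper).

    Returns the first npiv rows of R (final rows of the sparse R factor)
    and the remaining trailing block (the contribution block for the parent).
    """
    r = np.linalg.qr(front, mode="r")
    return r[:npiv, :], r[npiv:, npiv:]

# Illustrative sparse pattern: columns 0 and 1 are leaves, column 2 is their parent.
rng = np.random.default_rng(0)
A = np.zeros((6, 3))
A[0:2, [0, 2]] = rng.standard_normal((2, 2))  # rows 0-1 touch columns 0 and 2
A[2:4, [1, 2]] = rng.standard_normal((2, 2))  # rows 2-3 touch columns 1 and 2
A[4:6, 2] = rng.standard_normal(2)            # rows 4-5 touch column 2 only

# Leaf fronts: one pivot column each, plus the shared column 2.
r0, c0 = factorize_front(A[0:2][:, [0, 2]], npiv=1)
r1, c1 = factorize_front(A[2:4][:, [1, 2]], npiv=1)

# Parent front: assemble both contribution blocks with the remaining rows of A.
parent = np.vstack([c0, c1, A[4:6, 2:3]])
r2, _ = factorize_front(parent, npiv=1)

# Scatter the pivotal rows of each front into the global R factor.
R = np.zeros((3, 3))
R[0, [0, 2]] = r0[0]
R[1, [1, 2]] = r1[0]
R[2, 2] = r2[0, 0]

# R is a valid triangular factor of A exactly when R^T R equals A^T A.
print(np.allclose(R.T @ R, A.T @ A))
```

In this sketch the two leaf fronts are independent of each other, which is the kind of parallelism a multifrontal method can exploit. In the approach described above, the fronts and contribution blocks would stay in GPU memory between steps instead of living in host arrays.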