GPU Acceleration of a 3D Finite Difference Earthquake Code on XSEDE Keeneland

Yifeng Cui

Published 2012, SCEC Contribution #1758

We have developed a highly scalable 3D finite difference GPU code for regional petascale earthquake simulations, intended for use in earthquake engineering and disaster management. This MPI-CUDA code is based on the widely used wave propagation code AWP-ODC and has been restructured for high throughput and efficiency on heterogeneous computing architectures. We present an effective communication reduction technique for leveraging GPUs with minimal PCI-e overhead, and a novel overlapping method that fully hides data communication latency between GPUs. The optimization concepts used in this work can be extended to general stencil computing on structured grids. Benchmarks demonstrated a sustained 100 TFlops in single precision for 49 billion mesh points using 952 GPUs on the NCCS Titan Phase 5 system, a 77-fold speedup over the CPU version of the code. This multi-GPU implementation has been validated and used for a large-scale verification wave propagation simulation of the Mw 5.4 Chino Hills earthquake using 128 GPUs.
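
The overlapping idea mentioned in the abstract can be illustrated with a minimal sketch. This is not AWP-ODC code: it is a serial, 1D toy in plain Python, and every name in it (update_cells, run_global, run_decomposed, the split index h, the coefficient c2) is a hypothetical choice made for illustration. It assumes a second-order leapfrog update of the 1D wave equation with fixed zero ends, split across two subdomains that each hold one halo cell. The cells next to the interface are updated first so that their values can be exchanged while the rest of the interior is still being computed. In an MPI-CUDA code the exchange would presumably be an asynchronous copy and message; here it is only an ordering of statements.

```python
def update_cells(u_prev, u_curr, u_next, cells, c2):
    """Apply a 3-point leapfrog wave-equation stencil to the given cells."""
    for i in cells:
        lap = u_curr[i - 1] - 2.0 * u_curr[i] + u_curr[i + 1]
        u_next[i] = 2.0 * u_curr[i] - u_prev[i] + c2 * lap


def run_global(u0, c2, steps):
    """Reference: advance the whole grid as a single domain (ends held at 0)."""
    u_prev, u_curr = list(u0), list(u0)
    for _ in range(steps):
        u_next = [0.0] * len(u_curr)
        update_cells(u_prev, u_curr, u_next, range(1, len(u_curr) - 1), c2)
        u_prev, u_curr = u_curr, u_next
    return u_curr


def run_decomposed(u0, h, c2, steps):
    """Advance two subdomains split at global index h (hypothetical layout).

    The left part owns global cells 0..h-1 plus one halo cell (global h).
    The right part owns global cells h..n-1 plus one halo cell (global h-1).
    Requires at least two owned cells on each side.
    """
    left_prev, left_curr = list(u0[:h + 1]), list(u0[:h + 1])
    right_prev, right_curr = list(u0[h - 1:]), list(u0[h - 1:])
    for _ in range(steps):
        left_next = [0.0] * len(left_curr)
        right_next = [0.0] * len(right_curr)

        # 1. Update the owned cells adjacent to the interface first.
        update_cells(left_prev, left_curr, left_next, [len(left_curr) - 2], c2)
        update_cells(right_prev, right_curr, right_next, [1], c2)

        # 2. Halo exchange. In a real multi-GPU code this transfer would be
        #    started asynchronously here (assumption, not shown).
        left_next[-1] = right_next[1]
        right_next[0] = left_next[-2]

        # 3. Update the remaining interior cells. This work does not depend
        #    on step 2, so it could run while the exchange is in flight.
        update_cells(left_prev, left_curr, left_next,
                     range(1, len(left_curr) - 2), c2)
        update_cells(right_prev, right_curr, right_next,
                     range(2, len(right_curr) - 1), c2)

        left_prev, left_curr = left_curr, left_next
        right_prev, right_curr = right_curr, right_next

    # Gather owned cells only, dropping the halo cell of each subdomain.
    return left_curr[:h] + right_curr[1:]


if __name__ == "__main__":
    u0 = [0.0] * 16
    u0[7] = u0[8] = 1.0  # small initial pulse straddling the interface
    ref = run_global(u0, 0.25, 20)
    dec = run_decomposed(u0, 8, 0.25, 20)
    print("max difference:", max(abs(a - b) for a, b in zip(ref, dec)))
```

Because the interior update in step 3 reads only values from the previous time levels, reordering it after the interface update does not change the result; the decomposed run should reproduce the single-domain run.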

Citation
Cui, Y. (2012). GPU Acceleration of a 3D Finite Difference Earthquake Code on XSEDE Keeneland. Oral Presentation at XSEDE Conference 2012.