369 Tflop/s molecular dynamics simulations on the petaflop hybrid supercomputer ‘Roadrunner’

✍ Scribed by Timothy C. Germann; Kai Kadau; Sriram Swaminarayan


Publisher: John Wiley and Sons
Year: 2009
Tongue: English
Weight: 305 KB
Volume: 21
Category: Article
ISSN: 1532-0626

✦ Synopsis


Abstract

We describe the implementation of a short‐range parallel molecular dynamics (MD) code, SPaSM, on the heterogeneous general‐purpose Roadrunner supercomputer. Each Roadrunner ‘TriBlade’ compute node consists of two AMD Opteron dual‐core microprocessors and four IBM PowerXCell 8i enhanced Cell microprocessors (each consisting of one PPU and eight SPU cores), so that there are four MPI ranks per node, each with one Opteron core and one Cell. We briefly describe the Roadrunner architecture and some of the initial hybrid programming approaches that have been taken, focusing on the SPaSM application as a case study. An initial ‘evolutionary’ port, in which the existing legacy code runs with minor modifications on the Opterons and the Cells are used only to compute interatomic forces, achieves roughly a 2× speedup over the unaccelerated code. In contrast, our ‘revolutionary’ implementation adopts a Cell‐centric view, with data structures optimized for, and living on, the Cells. The Opterons are mainly used to direct inter‐rank communication and perform I/O‐heavy periodic analysis, visualization, and checkpointing tasks. The performance measured for our initial implementation of a standard Lennard–Jones pair potential benchmark reached a peak of 369 Tflop/s double‐precision floating‐point performance on the full Roadrunner system (27.7% of peak), nearly 10× faster than the unaccelerated (Opteron‐only) version. Copyright © 2009 John Wiley & Sons, Ltd.
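
For readers unfamiliar with the benchmark named in the abstract, the following is a minimal sketch of a truncated Lennard–Jones pair force evaluation, the kind of short‐range interatomic force computation the abstract says is offloaded to the Cells. It is not SPaSM code and does not reflect its data structures or parallelization. The function names (`lj_pair`, `lj_forces`), the reduced units (epsilon = sigma = 1), the cutoff of 2.5 sigma, and the plain O(N²) double loop are all assumptions made for illustration.

```python
import math


def lj_pair(r2, epsilon=1.0, sigma=1.0):
    """Lennard-Jones energy and force/r for a pair at squared distance r2.

    U(r) = 4*eps*((sigma/r)**12 - (sigma/r)**6)
    Returns (U, F/r), where F = -dU/dr, so that the force vector on
    particle i is (F/r) * (r_i - r_j).
    """
    s2 = sigma * sigma / r2
    s6 = s2 ** 3
    energy = 4.0 * epsilon * (s6 * s6 - s6)
    force_over_r = 24.0 * epsilon * (2.0 * s6 * s6 - s6) / r2
    return energy, force_over_r


def lj_forces(positions, r_cut=2.5, epsilon=1.0, sigma=1.0):
    """Sum truncated LJ forces over all pairs (illustrative O(N^2) loop).

    positions: list of (x, y, z) tuples, no periodic boundaries.
    Returns (forces, potential_energy). The cutoff is a plain truncation
    with no energy shift; r_cut = 2.5 is an assumed, commonly used value.
    """
    n = len(positions)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    potential = 0.0
    rc2 = r_cut * r_cut
    for i in range(n - 1):
        for j in range(i + 1, n):
            d = [positions[i][k] - positions[j][k] for k in range(3)]
            r2 = sum(c * c for c in d)
            if r2 >= rc2:
                continue  # beyond the short-range cutoff
            e, f_over_r = lj_pair(r2, epsilon, sigma)
            potential += e
            for k in range(3):
                # Newton's third law: equal and opposite contributions
                forces[i][k] += f_over_r * d[k]
                forces[j][k] -= f_over_r * d[k]
    return forces, potential


if __name__ == "__main__":
    r_min = 2.0 ** (1.0 / 6.0)  # separation at the potential minimum
    f, u = lj_forces([(0.0, 0.0, 0.0), (r_min, 0.0, 0.0)])
    print("energy at minimum:", u)
    print("force on atom 0:", f[0])
```

Because each pair interaction is independent within the cutoff, this inner computation is the natural candidate for acceleration; a production code would replace the double loop with a cell or neighbor list so that cost scales linearly with the number of atoms.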