[ACM Press State of the Practice Reports - Seattle, Washington (2011.11.12-2011.11.18)] State of the Practice Reports on - SC '11 - Best practices for the deployment and management of production HPC clusters
Scribed by McLay, Robert; Schulz, Karl W.; Barth, William L.; Minyard, Tommy
- Book ID
- 121791501
- Publisher
- ACM Press
- Year
- 2011
- Weight
- 282 KB
- Category
- Article
- ISBN
- 1450311393
Synopsis
Commodity-based Linux HPC clusters dominate the scientific computing landscape in both academia and industry, ranging from small research clusters to petascale supercomputers supporting thousands of users. To support broad user communities and manage a user-friendly environment, end-user sites must combine a range of low-level system software with multiple compiler chains, support libraries, and a suite of third-party applications. In addition, large systems require bare-metal provisioning and a flexible software management strategy to maintain consistency and upgradeability across thousands of compute nodes. This report documents a Linux operating system framework (LosF), which has evolved over the last seven years to provide an integrated strategy for the deployment of multiple HPC systems at the Texas Advanced Computing Center. Documented within this effort are the high-level cluster configuration options and definitions, bare-metal provisioning, hierarchical HPC software stack design, package management, user environment management tools, user account synchronization, and local customization configurations.