
[ACM Press the third international workshop - San Jose, California, USA (2011.06.08-2011.06.08)] Proceedings of the third international workshop on Large-scale system and application performance - LSAP '11 - Making a case for distributed file systems at Exascale

Scribed by Raicu, Ioan; Foster, Ian T.; Beckman, Pete


Book ID: 121203660
Publisher: ACM Press
Year: 2011
Weight: 815 KB
Category: Article
ISBN: 1450307035


Synopsis


Exascale computers will enable the unraveling of significant scientific mysteries. Predictions are that 2019 will be the year of exascale, with millions of compute nodes and billions of threads of execution. The current architecture of high-end computing systems is decades old and has persisted as we scaled from gigascale to petascale. In this architecture, storage is completely segregated from the compute resources and is connected to them via a network interconnect. This approach will not scale several orders of magnitude in terms of concurrency and throughput, and will thus prevent the move from petascale to exascale. At exascale, basic functionality at high concurrency levels will suffer poor performance which, combined with a system mean time to failure measured in hours, will lead to a performance collapse for large-scale heroic applications. Storage has the potential to be the Achilles' heel of exascale systems. We propose that future high-end computing systems be designed with non-volatile memory on every compute node, allowing every compute node to actively participate in metadata and data management and leveraging many-core processors and the high bisection bandwidth of torus networks. This position paper discusses this revolutionary new distributed storage architecture that will make exascale computing more tractable, touching virtually all disciplines in high-end computing and fueling scientific discovery.
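To make the idea of every compute node participating in metadata management more concrete, here is a minimal sketch in Python. It is an illustration only, not the authors' design: the synopsis does not say how metadata would be placed, so the hash-modulo placement rule and the names `owner_node`, `NodeLocalStore`, and `DistributedMetadata` are all assumptions. Each `NodeLocalStore` is a hypothetical stand-in for one node's non-volatile memory.

```python
import hashlib


def owner_node(path: str, num_nodes: int) -> int:
    """Pick the node that holds metadata for `path` (assumed hash-modulo rule)."""
    if num_nodes <= 0:
        raise ValueError("num_nodes must be positive")
    # Use a stable hash so every node computes the same owner for a path.
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


class NodeLocalStore:
    """Hypothetical stand-in for the non-volatile memory on one compute node."""

    def __init__(self) -> None:
        self.metadata: dict[str, dict] = {}


class DistributedMetadata:
    """Toy metadata service in which every node owns a share of the namespace."""

    def __init__(self, num_nodes: int) -> None:
        self.nodes = [NodeLocalStore() for _ in range(num_nodes)]

    def put(self, path: str, attrs: dict) -> None:
        # Route the record to its owner; there is no central metadata server.
        self.nodes[owner_node(path, len(self.nodes))].metadata[path] = attrs

    def get(self, path: str):
        # Any client can locate the owner by recomputing the same hash.
        return self.nodes[owner_node(path, len(self.nodes))].metadata.get(path)


if __name__ == "__main__":
    fs = DistributedMetadata(num_nodes=8)
    for i in range(1000):
        fs.put(f"/data/run/file{i}", {"size": i})
    print(fs.get("/data/run/file42"))
    print([len(node.metadata) for node in fs.nodes])
```

Hash-modulo placement is chosen here only for brevity. It remaps most paths whenever the node count changes, which matters on a system whose mean time to failure is measured in hours, so a real design would more likely need a placement scheme that tolerates node loss, together with replication.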


SIMILAR VOLUMES


[ACM Press the third international works
Schnorr, Lucas Mello; Legrand, Arnaud; Vincent, Jean-Marc | Article | 2011 | ACM Press | 716 KB

Large scale distributed systems are composed of many thousands of computing units. Today's examples of such systems are grid, volunteer and cloud computing platforms. Generally, their analyses are done through monitoring tools that gather resource information like processor or network utilization, p