𝔖 Bobbio Scriptorium
✦   LIBER   ✦

Evaluation of Collective I/O Implementations on Parallel Architectures

✍ Scribed by Phillip M. Dickens; Rajeev Thakur


Publisher
Elsevier Science
Year
2001
Tongue
English
Weight
269 KB
Volume
61
Category
Article
ISSN
0743-7315


✦ Synopsis


In this paper, we evaluate the impact on performance of various implementation techniques for collective I/O operations, and we do so across four important parallel architectures. We show that a naive implementation of collective I/O does not result in significant performance gains for any of the architectures, but that an optimized implementation does provide excellent performance across all of the platforms under study. Furthermore, we demonstrate that there exists a single implementation strategy that provides the best performance for all four computational platforms. Next, we evaluate implementation techniques for thread-based collective I/O operations. We show that the most obvious implementation technique, which is to spawn a thread to execute the whole collective I/O operation in the background, frequently provides the worst performance, often performing much worse than just executing the collective I/O routine entirely in the foreground. To improve performance, we explore an alternate approach where part of the collective I/O operation is performed in the background, and part is performed in the foreground. We demonstrate that this implementation technique can provide significant performance gains, offering up to a 50% improvement over implementations that do not attempt to overlap collective I/O and computation.
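The split foreground/background idea in the synopsis can be illustrated with a small single-process sketch. It is not the authors' implementation: the synopsis does not say which part of the operation runs where, so the sketch assumes, purely for illustration, that the data exchange stays in the foreground and the file write moves to a background thread. The names `exchange`, `write_file`, `compute`, and `split_collective_write` are hypothetical, and no MPI library is involved.

```python
import os
import tempfile
import threading


def exchange(local_chunks):
    """Foreground step (assumed): gather scattered chunks into one contiguous buffer."""
    # Stand-in for the inter-process data exchange; chunks are ordered by file offset.
    return b"".join(data for _, data in sorted(local_chunks))


def write_file(path, buffer):
    """Background step (assumed): write the contiguous buffer to disk."""
    with open(path, "wb") as f:
        f.write(buffer)


def compute(n):
    """Placeholder for application work that overlaps with the write."""
    return sum(i * i for i in range(n))


def split_collective_write(path, local_chunks, n):
    buffer = exchange(local_chunks)  # done in the foreground
    writer = threading.Thread(target=write_file, args=(path, buffer))
    writer.start()                   # file write proceeds in the background
    result = compute(n)              # computation overlaps with the write
    writer.join()                    # the write must finish before we return
    return result


if __name__ == "__main__":
    out_path = os.path.join(tempfile.mkdtemp(), "out.bin")
    chunks = [(7, b"world"), (0, b"hello, ")]  # (file offset, data) pairs
    print(split_collective_write(out_path, chunks, 10))
```

By contrast, the "most obvious" variant described in the synopsis would hand both `exchange` and `write_file` to the background thread. How much the two variants differ depends on the platform, which is what the paper measures.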


📜 SIMILAR VOLUMES


Design and Implementation of a Parallel
✍ Jaechun No; Sung-soon Park; Jesus Carretero Perez; Alok Choudhary 📂 Article 📅 2002 🏛 Elsevier Science 🌐 English ⚖ 443 KB

We present the design, implementation, and evaluation of a runtime system based on collective I/O techniques for irregular applications. The design is motivated by the requirements of a large number of science and engineering applications including teraflops applications, where the data must be reor

Implementation of a Numerical Solution o
✍ Tamir G. Reisin; Sabine C. Wurzler 📂 Article 📅 2001 🏛 Elsevier Science 🌐 English ⚖ 779 KB

Two different numerical solutions of the two-component kinetic collection equation were implemented on parallel computers. The parallelization approach included domain decomposition and MPI commands for communications. Four different parallel codes were tested. A dynamic decomposition based on an oc