Distributed-memory message-passing machines deliver scalable performance but are difficult to program. Shared-memory machines, on the other hand, are easier to program, but obtaining scalable performance with a large number of processors is difficult. Recently, scalable machines based on logically shared …
Compilation and Communication Strategies for Out-of-Core Programs on Distributed Memory Machines
By Rajesh Bordawekar, Alok Choudhary, and J. Ramanujam
- Publisher: Elsevier Science
- Year: 1996
- Language: English
- Size: 518 KB
- Volume: 38
- Category: Article
- ISSN: 0743-7315
Synopsis
… The need for high-performance I/O is so significant that almost all present-generation parallel computers, such as the Paragon, SP-2, and nCUBE/2, provide some kind of hardware and software support for parallel I/O [dRC94].
Data-parallel languages such as High Performance Fortran (HPF) [Hig93] were designed for developing complex scientific applications on parallel machines. For these languages to be used for programming large applications, it is essential that they (and their compilers) provide support for applications requiring large data sets. As part of the ongoing PASSION [CBH+94] project, we are currently modifying the Portland Group HPF [MMWY95] compiler to support out-of-core applications [Bor96]. The PASSION compiler takes an HPF program accessing out-of-core arrays as input and produces the corresponding node program with calls to runtime routines for I/O and communication. The compiler strip-mines the computation so that only the data which is in memory is operated on, and handles the required buffering. Computation on in-memory data often requires data which is not present in a processor's memory, requiring local I/O access as well as communication among processors for access to nonlocal data. Since the data is stored in files, communication often results in file accesses. File access costs are normally three orders of magnitude greater than interprocessor communication costs, so it is very important to reduce disk access costs as much as possible. In this paper, we propose three strategies to perform communication when primary data is stored in files and illustrate them using two scientific applications.
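The strip-mining idea described above can be sketched in a minimal, self-contained way. The following Python function (an illustration, not the PASSION compiler's actual generated code; the function name, tile size, and use of `numpy.memmap` are assumptions for this sketch) processes an out-of-core array one in-core tile at a time, so that only one slab of the file-resident data occupies memory during the computation:

```python
import numpy as np

def out_of_core_scale(path, n, tile, factor, dtype=np.float64):
    """Strip-mined pass over an out-of-core 1-D array stored in a file.

    Only one tile of `tile` elements is resident in memory at a time;
    each tile is read from disk, computed on, and written back,
    mimicking the read/compute/write structure of compiler-generated
    out-of-core node programs."""
    data = np.memmap(path, dtype=dtype, mode="r+", shape=(n,))
    for start in range(0, n, tile):
        end = min(start + tile, n)
        buf = np.array(data[start:end])   # read one in-core tile from the file
        buf *= factor                     # compute only on the in-memory slab
        data[start:end] = buf             # write the tile back to the file
    data.flush()
```

The tile size plays the role of the in-core buffer that the compiler sizes against available node memory; shrinking it trades more I/O operations for a smaller memory footprint.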
The paper is organized as follows. Section 2 describes the out-of-core computation model; this section also introduces a data storage model called the local placement model. Section 3 describes the three proposed strategies for performing communication in out-of-core data-parallel problems. A running example of a 2-D elliptic solver using Jacobi relaxation is used to illustrate these strategies. Section 4 presents experimental performance results for the three communication strategies using two out-of-core applications.
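For context on the running example, a single Jacobi relaxation sweep for a 2-D elliptic solver can be written compactly (a generic sketch, not the paper's code; the function name and five-point averaging stencil are standard but assumed here). Each interior point is replaced by the average of its four neighbors, and it is exactly the stencil's dependence on neighboring rows and columns that forces a processor to fetch nonlocal boundary data, which in the out-of-core setting turns communication into file accesses:

```python
import numpy as np

def jacobi_step(u):
    """One Jacobi relaxation sweep on a 2-D grid.

    Each interior point becomes the average of its four neighbors;
    boundary values are held fixed. The neighbor accesses on the
    first and last interior rows/columns are the data a processor
    must obtain from other processors (or their files) when the
    grid is block-distributed."""
    v = u.copy()
    v[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1] +
                            u[1:-1, :-2] + u[1:-1, 2:])
    return v
```

Iterating `jacobi_step` until the change between sweeps falls below a tolerance yields the solver; in the out-of-core formulation, each sweep additionally strip-mines the grid through in-core slabs.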