Algorithm 1 (Compiling Align directives)
Input: Fortran 90D/HPF syntax tree with some alignment functions to template
Output: Fortran 90D/HPF syntax tree with identical alignment functions to template
Method: For each aligned array, and for each dimension of that array, carry out the following ste
Compiling High Performance Fortran for distributed-memory architectures
By Siegfried Benkner; Hans Zima
- Publisher: Elsevier Science
- Year: 1999
- Language: English
- Size: 509 KB
- Volume: 25
- Category: Article
- ISSN: 0167-8191
Synopsis
High Performance Fortran (HPF) is a data-parallel language that provides a high-level interface for programming scientific applications, while delegating to the compiler the task of generating explicitly parallel message-passing programs. This paper provides an overview of HPF compilation and runtime technology for distributed-memory architectures, and deals with a number of topics in some detail. In particular, we discuss distribution and alignment processing, the basic compilation scheme and methods for the optimization of regular computations. A separate section is devoted to the transformation and optimization of independent loops with irregular data accesses. The paper concludes with a discussion of research issues and outlines potential future development paths of the language.
SIMILAR VOLUMES
We describe a new approach to programming distributed-memory computers. Rather than having each node in the system explicitly programmed, we derive an efficient message-passing program from a sequential shared-memory program annotated with directions on how elements of shared arrays are distributed
data-parallelism in these languages. Array expressions involve array sections which consist of array elements from a lower index to an upper index at a fixed stride. In order to generate high-performance target code, compilers for distributed-memory machines should produce efficient code for array s
We present three parallel implementations of the Karatsuba algorithm for long integer multiplication on a distributed memory architecture and discuss the experimental results obtained on a Paragon computer. The first two implementations have both time complexity $O(n)$ on $n^{\log_2 3}$ processors, but pre
The rapid rise of OpenMP as the preferred parallel programming paradigm for small-to-medium scale parallelism could slow unless OpenMP can show capabilities for becoming the model-of-choice for large scale high-performance parallel computing in the coming decade. The main stumbling blo