A key measure of the performance of a distributed memory parallel program is the communication overhead. On most current parallel systems, sending data from a local to a remote processor still takes one or two orders of magnitude longer than the time to access data on a local processor. The behavior …
Communication Optimizations for Parallel C Programs
By Yingchun Zhu; Laurie J. Hendren
- Publisher
- Elsevier Science
- Year
- 1999
- Language
- English
- File size
- 661 KB
- Volume
- 58
- Category
- Article
- ISSN
- 0743-7315
Free to read; no payment or registration required. For personal study only.
Synopsis
This paper presents algorithms for reducing the communication overhead of parallel C programs that use dynamically allocated data structures. The framework consists of an analysis phase called possible-placement analysis and a transformation phase called communication selection. The fundamental idea of possible-placement analysis is to find all possible points for inserting remote memory operations: remote reads are propagated upwards, whereas remote writes are propagated downwards. Based on the results of the possible-placement analysis, the communication selection transformation selects the "best" place for inserting the communication and determines whether pipelining or blocking of communication should be performed. The framework has been implemented in the EARTH-McCAT optimizing C compiler, and experimental results are presented for five pointer-intensive benchmarks running on the EARTH-MANNA distributed-memory parallel processor. These experiments show that the communication optimization can provide performance improvements of up to 160% over the unoptimized benchmarks.
SIMILAR VOLUMES
Generating local addresses and communication sets is an important issue in distributed-memory implementations of data-parallel languages such as High Performance Fortran. We demonstrate a storage scheme for an array \(A\) affinely aligned to a template that is distributed across \(p\) processors with …
We present compiler analyses and optimizations for explicitly parallel programs that communicate through a shared address space. Any type of code motion on explicitly parallel programs requires a new kind of analysis to ensure that operations reordered on one processor cannot be observed by another.