Architectures and message-passing algorithms for cluster computing: Design and performance

Scribed by Edward K. Blum; Xin Wang; Patrick Leung


Publisher: Elsevier Science
Year: 2000
Language: English
File size: 239 KB
Volume: 26
Category: Article
ISSN: 0167-8191

Synopsis


This paper considers the architecture of clusters and related message-passing (MP) software algorithms and their effect on performance (speedup and efficiency) of cluster computing (CC). We present new architectures for multi-segment Ethernet clusters and new MP algorithms which fit these architectures. The multiple segments (e.g. commodity hubs) connect commodity processor nodes so as to allow MP to be highly parallelized by avoiding network contention and collisions in many applications where the all-gather and other collective operations are central. We analyze all-gather in some detail, and present new network topologies and new MP algorithms to minimize latency. The new topologies are based on a design, called two-by-four nets (2 × 4 nets), by Compbionics. An integrated MP software system, called Reduced Overhead Cluster Communication (ROCC), which embodies the MP algorithms, is also described. In brief, 2 × 4 nets are networks of "supernodes", called 2 × 4's, each having 4 processors on 2 segments, with the segments usually being Ethernet hubs. The supernodes are typically connected to form rings or tori of supernodes. We present actual test results and supporting analyses to demonstrate that 2 × 4 nets with the ROCC MP software are faster than many existing clusters and generally less costly.
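For readers unfamiliar with the all-gather collective named in the synopsis, the following is a minimal Python sketch of the textbook ring all-gather, together with the usual definitions of speedup and efficiency. It is not the ROCC algorithm or the 2 × 4 topology from the paper, whose details are not given on this page; the function names are hypothetical and the network is simulated in memory rather than over Ethernet.

```python
def ring_all_gather(blocks):
    """Simulate a generic ring all-gather (assumption: NOT the paper's ROCC method).

    blocks[i] is the data initially held by node i. After p - 1 steps,
    every node holds every block.
    """
    p = len(blocks)
    # gathered[i][j] is block j as known to node i (None until received).
    gathered = [[None] * p for _ in range(p)]
    for i in range(p):
        gathered[i][i] = blocks[i]

    for step in range(p - 1):
        # In each step, node i forwards block (i - step) mod p to its
        # right-hand neighbour (i + 1) mod p. Collect all sends first so
        # the step behaves as if every node transmits simultaneously.
        sends = []
        for i in range(p):
            idx = (i - step) % p
            sends.append(((i + 1) % p, idx, gathered[i][idx]))
        for dest, idx, payload in sends:
            gathered[dest][idx] = payload
    return gathered


def speedup(t_serial, t_parallel):
    """Standard definition: serial time divided by parallel time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, p):
    """Standard definition: speedup divided by the number of processors."""
    return speedup(t_serial, t_parallel) / p


if __name__ == "__main__":
    result = ring_all_gather(["a", "b", "c", "d"])
    for node, held in enumerate(result):
        print(node, held)
```

In this simulation each of the p - 1 steps moves one block per node, which is why contention-free links between neighbours matter for the latency of the collective.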


SIMILAR VOLUMES


A Multithreaded Message Passing Interfac
Boris V. Protopopov; Anthony Skjellum · Article · 2001 · Elsevier Science · English · 222 KB

This paper discusses a multithreaded software architecture for message-passing interface (MPI) software specification. The architecture is thread-safe, allows for concurrent communication over several communications media (multifabric communication), efficiently utilizes available hardware concurrenc

Enhanced distributed computing message p
C. J. Gillan; V. Fusco · Article · 1999 · John Wiley and Sons · English · 91 KB

The finite difference time domain method (FDTD) solves Maxwell's equations by employing numerically and storage intensive computation to map the electric and magnetic fields within a finite volume as an explicit function of time. Distributed computation, using heterogeneous networks of computers, is a c

Performance modeling for SPMD message-pa
BREHM, JÜRGEN; WORLEY, PATRICK H.; MADHUKAR, MANISH · Article · 1998 · John Wiley and Sons · English · 279 KB

Today's massively parallel machines are typically message-passing systems consisting of hundreds or thousands of processors. Implementing parallel applications efficiently in this environment is a challenging task, and poor parallel design decisions can be expensive to correct. Tools and techniques

Ring, torus and hypercube architectures/
S. Lakshmivarahan; Sudarshan K. Dhall · Article · 1999 · Elsevier Science · English · 371 KB

This paper provides a survey of both architectural and algorithmic aspects of solving problems using parallel processors with ring, torus and hypercube interconnection.