We analyze and model the performance of heterogeneous parallel computing systems, where in general each node has a dierent computing power. The main features of our approach are: a simple but quite rigorous analysis; an `energetic' perspective on performance analysis, using concepts like the useful
Performance modeling and analysis of correlated parallel computations
โ Scribed by Wei-Ming Lin
- Publisher
- Elsevier Science
- Year
- 2008
- Tongue
- English
- Weight
- 330 KB
- Volume
- 34
- Category
- Article
- ISSN
- 0167-8191
No coin nor oath required. For personal study only.
โฆ Synopsis
A performance analysis methodology for correlated parallel computations based on statistical theory is proposed. Divide-and-conquer strategy is widely used in solving problems in parallel by partitioning and allocating a number of given tasks to available computing resources. When the tasks exhibit run-time-dependent behaviors during execution and share a universal distribution function in their execution times, analysis of parallel execution time can be performed with the assistance of probabilistic and statistical models. Correlation (dependence) in execution times among tasks has posed a significant factor in influencing the analysis accuracy which is unmanageable by any known analysis methodologies. We establish a relation between a task's or a processor's execution time and the parallel execution time, in terms of expected value as well as variance when each task's execution time can be closely modeled by a normal distribution, for either uncorrelated or correlated tasks. This relation is then applied to the modeling and analysis of various parallel computation paradigms in which different communication and synchronization patterns along the processing are present. The method proposed has a wider application scope and gives more accurate prediction results than previously known approaches. We also show that, as an extended application of the analysis method to a large scope of problems, load balance among processors can be vastly improved with some novel static task allocation technique in manipulating the correlation among tasks. Experimental results in analyzing a parallel tree search algorithm and two parallel sorting algorithms show very accurate analysis and prediction with the proposed method.
๐ SIMILAR VOLUMES
Kapelnikov is currently employed at the Information Services, Division of Citicorp, where he is responsible for the design and analysis of a heterogeneous computer network for a large-scale information delivery system. Previously, he had been involved in the design, modeling, and simulation of a var
Heydorn, S. and P. Weidner, Optimization and performance analysis of thinning algorithms on parallel computers, Parallel Computing 17 (1991) 17-27. This paper presents a concept for an implementation of different parallel thinning algorithms on parallel processors. The emphasis is put on a good para
A two-dimensional \((h, p)\) finite element scheme for distributed parallel computation is developed. The approach is based on an element-by-element domain decomposition and is implemented on the nCUBE2 system. Example problems are used to demonstrate performance of the algorithm for a range of \((h