๐”– Bobbio Scriptorium
โœฆ   LIBER   โœฆ

Global Optimization for Mapping Parallel Image Processing Tasks on Distributed Memory Machines

✍ Scribed by Cheolwhan Lee; Yuan-Fang Wang; Tao Yang


Publisher: Elsevier Science
Year: 1997
Tongue: English
Weight: 565 KB
Volume: 45
Category: Article
ISSN: 0743-7315


✦ Synopsis


Many parallel algorithms and library routines are available for computer vision and image processing (CVIP) tasks on distributed-memory multiprocessors. A typical image distribution uses column-, row-, or block-based mapping. Integrating a set of library routines into a CVIP application requires a global optimization that determines the data mapping of each task while accounting for inter-task communication. The main difficulty in deriving the optimal image data distribution for each task is that CVIP task computation may involve loops, and that the number of available processors and the size of the input image may vary at run time. In this paper, a CVIP application is modeled as a task chain with imperfectly nested loops, specified by conventional visual languages such as Khoros and Explorer. A mapping algorithm is proposed that optimizes the average runtime performance of CVIP applications with nested loops by considering data redistribution overheads and possible variations in runtime parameters. A taxonomy of CVIP operations is provided and used to further reduce the complexity of the algorithm. Experimental results on both low-level image processing and high-level computer vision applications are presented to validate this approach.
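
To make the mapping problem in the synopsis concrete, the following is a minimal sketch of choosing one image distribution per task in a simple chain so that computation cost plus redistribution cost is minimized. It is not the algorithm from the paper: it ignores loops and runtime parameter variations, and the function name `map_chain`, the cost tables, and the layout names are all hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Tuple

Layout = str  # hypothetical layout names, e.g. "row", "column", "block"


def map_chain(
    compute_cost: List[Dict[Layout, float]],
    redistribution_cost: Dict[Tuple[Layout, Layout], float],
) -> Tuple[float, List[Layout]]:
    """Pick one layout per task in a chain, minimizing total cost.

    compute_cost[i][l] is the assumed cost of running task i under layout l.
    redistribution_cost[(a, b)] is the assumed cost of converting the image
    from layout a to layout b between two consecutive tasks (a != b).
    """
    if not compute_cost:
        return 0.0, []

    def redistribute(prev: Layout, cur: Layout) -> float:
        # Keeping the same layout between tasks costs nothing.
        return 0.0 if prev == cur else redistribution_cost[(prev, cur)]

    # best[l] = (cheapest cost of the chain so far ending in layout l, layouts used)
    best = {l: (c, [l]) for l, c in compute_cost[0].items()}
    for task in compute_cost[1:]:
        nxt = {}
        for cur, c in task.items():
            # Cheapest way to arrive at this task with the image in layout `cur`.
            prev_cost, prev_path = min(
                ((cost + redistribute(prev, cur), path)
                 for prev, (cost, path) in best.items()),
                key=lambda t: t[0],
            )
            nxt[cur] = (prev_cost + c, prev_path + [cur])
        best = nxt
    return min(best.values(), key=lambda t: t[0])


# Two tasks: the first prefers rows, the second prefers columns.
tasks = [{"row": 1.0, "column": 4.0}, {"row": 6.0, "column": 1.0}]
cheap = {("row", "column"): 2.0, ("column", "row"): 2.0}
costly = {("row", "column"): 10.0, ("column", "row"): 10.0}
print(map_chain(tasks, cheap))
print(map_chain(tasks, costly))
```

With cheap redistribution, the sketch switches layouts between the two tasks. With costly redistribution, it keeps a single layout throughout even though that layout is not the best one for the first task. This is the trade-off that motivates optimizing the mapping globally rather than task by task.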


📜 SIMILAR VOLUMES


Compiler Algorithms for Optimizing Local
✍ M. Kandemir; J. Ramanujam; A. Choudhary 📂 Article 📅 2000 🏛 Elsevier Science 🌐 English ⚖ 608 KB

Distributed-memory message-passing machines deliver scalable performance but are difficult to program. Shared-memory machines, on the other hand, are easier to program, but obtaining scalable performance with a large number of processors is difficult. Recently, scalable machines based on logically shar