Hardware Acceleration of EDA Algorithms: Custom ICs, FPGAs and GPUs
Kanupriya Gulati, Sunil P. Khatri
This book deals with the acceleration of EDA algorithms using hardware platforms such as Custom ICs, FPGAs and GPUs. Widely applied CAD algorithms are studie
Hardware Acceleration of EDA Algorithms
Scribed by Kanupriya Gulati, Sunil P. Khatri
- Publisher: Springer
- Year: 2010
- Language: English
- Pages: 207
- Category: Library
Synopsis
Single-threaded software applications have ceased to see significant gains in performance on a general-purpose CPU, even with further scaling in very large scale integration (VLSI) technology. This is a significant problem for electronic design automation (EDA) applications, since the design complexity of VLSI integrated circuits (ICs) is continuously growing. In this research monograph, we evaluate custom ICs, field-programmable gate arrays (FPGAs), and graphics processors as platforms for accelerating EDA algorithms, instead of the general-purpose single-threaded CPU. We study applications which are used in key time-consuming steps of the VLSI design flow. Further, these applications also have different degrees of inherent parallelism in them. We study both control-dominated EDA applications and control plus data parallel EDA applications. We accelerate these applications on these different hardware platforms. We also present an automated approach for accelerating certain uniprocessor applications on a graphics processor. This monograph compares custom ICs, FPGAs, and graphics processing units (GPUs) as potential platforms to accelerate EDA algorithms. It also provides details of the programming model used for interfacing with the GPUs.
Table of Contents
Foreword
Preface
Acknowledgments
Contents
List of Tables
List of Figures
1 Introduction
1.1 Hardware Platforms Considered in This Research Monograph
1.2 EDA Algorithms Studied in This Research Monograph
1.2.1 Control-Dominated Applications
1.2.2 Control Plus Data Parallel Applications
1.3 Automated Approach for GPU-Based Software Acceleration
1.4 Chapter Summary
References
Part I Alternative Hardware Platforms
2 Hardware Platforms
2.1 Chapter Overview
2.2 Introduction
2.3 Hardware Platforms Studied in This Research Monograph
2.3.1 Custom ICs
2.3.2 FPGAs
2.3.3 Graphics Processors
2.4 General Overview and Architecture
2.5 Programming Model and Environment
2.6 Scalability
2.7 Design Turn-Around Time
2.8 Performance
2.9 Cost of Hardware
2.10 Floating Point Operations
2.11 Security and Real-Time Applications
2.12 Applications
2.13 Chapter Summary
References
3 GPU Architecture and the CUDA Programming Model
3.1 Chapter Overview
3.2 Introduction
3.3 Hardware Model
3.4 Memory Model
3.5 Programming Model
3.6 Chapter Summary
References
Part II Control-Dominated Category
4 Accelerating Boolean Satisfiability on a Custom IC
4.1 Chapter Overview
4.2 Introduction
4.3 Previous Work
4.4 Hardware Architecture
4.4.1 Abstract Overview
4.4.2 Hardware Overview
4.4.3 Hardware Details
4.4.3.1 Decision Engine
4.4.3.2 Clause Cell
4.4.3.3 Base Cell
4.4.3.4 Partitioning the Hardware
4.4.3.5 Inter-bank Communication
4.5 An Example of Conflict Clause Generation
4.6 Partitioning the CNF Instance
4.7 Extraction of the Unsatisfiable Core
4.8 Experimental Results
4.9 Chapter Summary
References
5 Accelerating Boolean Satisfiability on an FPGA
5.1 Chapter Overview
5.2 Introduction
5.3 Previous Work
5.4 Hardware Architecture
5.4.1 Architecture Overview
5.5 Solving a CNF Instance Which Is Partitioned into Several Bins
5.6 Partitioning the CNF Instance
5.7 Hardware Details
5.8 Experimental Results
5.8.1 Current Implementation
5.8.2 Performance Model
5.8.2.1 FPGA Resources
5.8.2.2 Clauses/Variable Ratio
5.8.2.3 Cycles Versus Bin Size
5.8.2.4 Bins Touched Versus Bin Size
5.8.2.5 Bin Size
5.8.3 Projections
5.9 Chapter Summary
References
6 Accelerating Boolean Satisfiability on a Graphics Processing Unit
6.1 Chapter Overview
6.2 Introduction
6.3 Related Previous Work
6.4 Our Approach
6.4.1 SurveySAT and the GPU
6.4.1.1 SurveySAT
6.4.1.2 SurveySAT on the GPU
6.4.1.3 SurveySAT Results on the GPU
6.4.2 MiniSAT Enhanced with Survey Propagation (MESP)
6.5 Experimental Results
6.6 Chapter Summary
References
Part III Control Plus Data Parallel Applications
7 Accelerating Statistical Static Timing Analysis Using Graphics Processors
7.1 Chapter Overview
7.2 Introduction
7.3 Previous Work
7.4 Our Approach
7.4.1 Static Timing Analysis (STA) at a Gate
7.4.2 Statistical Static Timing Analysis (SSTA) at a Gate
7.5 Experimental Results
7.6 Chapter Summary
References
8 Accelerating Fault Simulation Using Graphics Processors
8.1 Chapter Overview
8.2 Introduction
8.3 Previous Work
8.4 Our Approach
8.4.1 Logic Simulation at a Gate
8.4.2 Fault Injection at a Gate
8.4.3 Fault Detection at a Gate
8.4.4 Fault Simulation of a Circuit
8.5 Experimental Results
8.6 Chapter Summary
References
9 Fault Table Generation Using Graphics Processors
9.1 Chapter Overview
9.2 Introduction
9.3 Previous Work
9.4 Our Approach
9.4.1 Definitions
9.4.2 Algorithms: FSIM* and GFTABLE
9.4.2.1 Generating Vectors (Line 9)
9.4.2.2 Fault-Free Simulation (Line 10)
9.4.2.3 Computing Detectabilities and Cumulative Detectabilities (Lines 13, 14)
9.4.2.4 Fault Simulation of SR(s) (Lines 15, 16)
9.4.2.5 Generating the Fault Table (Lines 22–31)
9.5 Experimental Results
9.6 Chapter Summary
References
10 Accelerating Circuit Simulation Using Graphics Processors
10.1 Chapter Overview
10.2 Introduction
10.3 Previous Work
10.4 Our Approach
10.4.1 Parallelizing BSIM3 Model Computations on a GPU
10.4.1.1 Inlining if-then-else Code
10.4.1.2 Partitioning the BSIM3 Code into Kernels
10.4.1.3 Efficient Use of GPU Memory Model
10.4.1.4 Thread Scheduler and Code Statistics
10.5 Experiments
10.6 Chapter Summary
References
Part IV Automated Generation of GPU Code
11 Automated Approach for Graphics Processor Based Software Acceleration
11.1 Chapter Overview
11.2 Introduction
11.3 Our Approach
11.3.1 Problem Definition
11.3.2 GPU Constraints on the Kernel Generation Engine
11.3.3 Automatic Kernel Generation Engine
11.3.3.1 Node Duplication
11.3.3.2 Cost of a Partitioning Solution
11.4 Experimental Results
11.4.1 Evaluation Methodology
11.5 Chapter Summary
References
12 Conclusions
References
Index
SIMILAR VOLUMES
This book describes algorithms and hardware implementations of computer holography, especially in terms of fast calculation. It summarizes the basics of holography and computer holography and describes how conventional diffraction calculations play a central role. Numerical implementations by actual
This book suggests and describes a number of fast parallel circuits for data/vector processing using FPGA-based hardware accelerators. Three primary areas are covered: searching, sorting, and counting in combinational and iterative networks. These include the application of traditional structu