
Principles of Parallel Scientific Computing

By Tobias Weinzierl


Publisher
Springer
Year
2022
Language
English
Pages
302
Edition
1
Category
Library


✦ Synopsis


New insight in many scientific and engineering fields is unthinkable without numerical simulations that run efficiently on modern computers. The faster we obtain new results, the bigger and the more accurate the problems we can solve. It is the combination of mathematical ideas and efficient programming that drives progress in many disciplines. Future champions in the area will thus have to be qualified in their application domain, will need a profound understanding of some mathematical ideas, and will need the skills to deliver fast code.

The present textbook targets students who already have programming skills and do not shy away from mathematics, whether their background is in computer science or in an application domain. It introduces the basic concepts and ideas behind applied mathematics and parallel programming that we need to write numerical simulations for today's multicore workstations. Our intention is not to dive into one particular application domain or to introduce a new programming language; rather, we lay the generic foundations for future courses and projects in the area.

The text is written in an accessible style that is easy to digest for students without years of mathematics education. It values clarity and intuition over formalism, and uses a simple N-body simulation setup to illustrate basic ideas that recur across many subdomains of scientific computing. Its primary goal is to make theoretical and paradigmatic ideas accessible to undergraduate students and to convey the fascination of the field.
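The N-body setup mentioned above pairs naturally with the explicit Euler time stepping that the book uses as its model problem (Chapter 3). A minimal sketch in Python follows; the function name, the 2D setup, and the unit gravitational constant are illustrative choices, not the book's own code:

```python
# Minimal N-body sketch with an explicit Euler time integrator,
# in the spirit of the model problem used throughout the book.
# All names and parameters here are illustrative assumptions.

def step(positions, velocities, masses, dt, G=1.0):
    """Advance all bodies by one explicit Euler step of size dt."""
    n = len(masses)
    accelerations = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Vector from body i to body j and its length.
            dx = positions[j][0] - positions[i][0]
            dy = positions[j][1] - positions[i][1]
            r = (dx * dx + dy * dy) ** 0.5
            # Gravitational pull of body j on body i.
            accelerations[i][0] += G * masses[j] * dx / r**3
            accelerations[i][1] += G * masses[j] * dy / r**3
    for i in range(n):
        # Explicit Euler: advance positions with the old velocities,
        # then velocities with the freshly computed accelerations.
        positions[i][0] += dt * velocities[i][0]
        positions[i][1] += dt * velocities[i][1]
        velocities[i][0] += dt * accelerations[i][0]
        velocities[i][1] += dt * accelerations[i][1]
    return positions, velocities
```

Because the pairwise forces are equal and opposite, the total momentum is preserved exactly under this update, which makes a handy sanity check; the energy, by contrast, drifts with the step size, which is the kind of stability question the book's later chapters examine.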

✦ Table of Contents


Preface
Why This Book
Mission Statement
Structure and Style
Acknowledgements
Learning and Teaching Mode
Contents
Part I Introduction: Why to Study the Subject
1 The Pillars of Science
1.1 A Third Pillar?
1.2 Computational X
1.3 About the Style of the Text and Some Shortcomings
2 Moore Myths
2.1 Moore's Law
2.2 Dennard Scaling
2.3 The Three Layers of Parallelism
3 Our Model Problem (Our First Encounter with the Explicit Euler)
3.1 The N-Body Problem
3.2 Time Discretisation: Running a Movie
3.3 Numerical Versus Analytical Solutions
Wrap-up
Part II How the Machine Works
4 Floating Point Numbers
4.1 Fixed Point Formats
4.2 Floating Point Numbers
4.2.1 Normalisation
4.2.2 The IEEE Format
4.2.3 Realising Floating-Point Support
4.3 Machine Precision
4.4 Programming Guidelines
5 A Simplistic Machine Model
5.1 Blueprint of a Computer's Execution Workflow
5.2 A Working SISD Example
5.3 Flaws
Wrap-up
Part III Floating Point Number Crunching
6 Round-Off Error Propagation
6.1 Inexact Arithmetics
6.2 Round-Off Error Analysis
6.3 Programming Recipes
7 SIMD Vector Crunching
7.1 SIMD in Flynn's Taxonomy
7.2 Suitable Code Snippets
7.3 Hardware Realisation Flavours
7.3.1 Large Vector Registers
7.3.2 Lockstepping
7.3.3 Branching
8 Arithmetic Stability of an Implementation
8.1 Arithmetic Stability
8.2 Stability of Some Example Problems
8.2.1 The N-Body Model Problem
8.2.2 Logistic Growth
8.2.3 Scalar Product
9 Vectorisation of the Model Problem
9.1 Compiler Feedback
9.2 Explicit Vectorisation with OpenMP
9.2.1 The simd Pragma
9.2.2 Loop Collapsing
9.2.3 Aliasing
9.2.4 Reductions
9.2.5 Functions
Wrap-up
Part IV Basic Numerical Techniques and Terms
10 Conditioning and Well-Posedness
10.1 Condition Number
10.2 The Condition Number for Linear Equation Systems
10.3 Backward Stability: Arithmetic Stability Revisited
11 Taylor Expansion
11.1 Taylor
11.2 Functions with Multiple Arguments
11.3 Applications of Taylor Expansion
12 Ordinary Differential Equations
12.1 Terminology
12.2 Attractive and Stable Solutions to ODEs
12.3 Approximation of ODEs Through Taylor Expansion
13 Accuracy and Appropriateness of Numerical Schemes
13.1 Stability
13.2 Convergence
13.3 Convergence Plots
Wrap-up
Part V Using a Multicore Computer
14 Writing Parallel Code
14.1 MIMD in Flynn's Taxonomy
14.2 BSP
14.3 Realisation of Our Approach
15 Upscaling Models and Scaling Measurements
15.1 Strong Scaling
15.2 Weak Scaling
15.3 Comparison
15.4 Data Presentation
16 OpenMP Primer: BSP on Multicores
16.1 A First (Working) OpenMP Code
16.2 Suitable For-Loops
16.3 Thread Information Exchange
16.3.1 Read and Write Access Protection
16.3.2 Variable Visibility
16.3.3 All-to-One Thread Synchronisation
16.4 Grain Sizes and Scheduling
17 Shared Memory Tasking
17.1 Task Graphs Revisited
17.2 Basics of OpenMP Tasking
17.2.1 The Task Dependency Graph and Ready Tasks
17.2.2 Task Synchronisation
17.2.3 Variations of the Example
17.3 Task Properties and Data Visibility
17.4 Task Dependencies
18 GPGPUs with OpenMP
18.1 Informal Sketch of GPU Hardware
18.1.1 Memory
18.1.2 Streaming Multiprocessors
18.1.3 Core Design
18.2 GPU Offloading
18.3 Multi-SM Codes
18.4 Data Movement and Management on GPUs
18.4.1 Moving Arrays to and From the Device
18.4.2 Data Management and Explicit Data Movements
18.4.3 Explicit Data Transfer
18.4.4 Overlapping Data Transfer and Computations
18.5 Collectives
18.6 Functions and Structs on GPUs
Wrap-up
Part VI Faster and More Accurate Numerical Codes
19 Higher Order Methods
19.1 An Alternative Interpretation of Time Stepping
19.2 Adams-Bashforth
19.3 Runge-Kutta Methods
19.3.1 Accuracy
19.3.2 Cost and Parallelisation
19.4 Leapfrog
20 Adaptive Time Stepping
20.1 Motivation: A Very Simple Two-Grid Method
20.2 Formalisation
20.3 Error Estimators and Adaptivity Strategies
20.3.1 Adaptive Time Stepping
20.3.2 Local Time Stepping
20.3.3 p-Refinement
20.4 Hard-Coded Refinement Strategies
Wrap-up
A Using the Text
A.1 Course Organisation
A.2 What I Have Not Covered (But Maybe Should Have Done)
A.3 Notation Used Throughout the Book
A.4 Useful Software
B Cheat Sheet: System Benchmarking
B.1 Looking Up Your Hardware
B.2 Compute the Theoretical Capability (Peak Performance)
B.3 Determine Effective Bandwidth
C Cheat Sheet: Performance Assessment
C.1 Compiler Options
C.2 Getting OpenMP Right
C.3 Assessing the Code
C.3.1 Code Characterisation
C.3.2 Diving into the Code
C.3.3 Next Steps
D Cheat Sheet: Calibrating the Upscaling Models
D.1 Two-Point Calibration
D.2 Runtime Normalisation
E Cheat Sheet: Convergence and Stability Studies
E.1 Construct Test Scenarios
E.2 Choosing the Right Norms
E.3 Interpreting Your Data
F Cheat Sheet: Data Presentation
F.1 Tables
F.2 Graphs
F.3 Formulae
F.4 Integrating Figures and Tables into Your Write-Up
Index


✦ Similar Volumes


Parallel Scientific Computing
By Frédéric Magoulès, François-Xavier Roux, and Guillaume Houzeaux · Library · 2015 · Wiley-ISTE · English

Scientific computing has become an indispensable tool in numerous fields, such as physics, mechanics, biology, finance and industry. For example, it enables us, thanks to efficient algorithms adapted to current computers, to simulate, without the help of models or experimentations, the …

Parallel Processing for Scientific Computing
Edited by Michael A. Heroux, Padma Raghavan, and Horst D. Simon · Library · 2006 · Society for Industrial and Applied Mathematics · English

Software, Environments, and Tools 20. Scientific computing has often been called the third approach to scientific discovery, emerging as a peer to experimentation and theory. Historically, the synergy between experimentation and theory has been well understood: experiments give insight into po…