From Shortest Paths to Reinforcement Learning: A MATLAB-Based Tutorial on Dynamic Programming

✍ Scribed by Paolo Brandimarte


Publisher: Springer
Year: 2021
Tongue: English
Leaves: 216
Series: EURO Advanced Tutorials on Operational Research
Category: Library


✦ Synopsis


Dynamic programming (DP) has a long history as a powerful and flexible optimization principle, but has a bad reputation as a computationally impractical tool. This book fills a gap between the statement of DP principles and their actual software implementation. Using MATLAB throughout, this tutorial gently gets the reader acquainted with DP and its potential applications, offering the possibility of actual experimentation and hands-on experience. The book assumes basic familiarity with probability and optimization, and is suitable for both practitioners and graduate students in engineering, applied mathematics, management, finance, and economics.
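The decomposition principle the synopsis alludes to, and which Chapter 1 introduces through the shortest path problem, fits in a few lines of code. Below is a minimal sketch of the backward value recursion on an invented four-node network (node names and arc costs are made up for illustration); the book itself works in MATLAB, but this sketch uses Python:

```python
# Backward DP for shortest paths on a small acyclic network.
# V(node) = min over successors n of [arc cost to n + V(n)], with V(t) = 0.
# The network below is an invented example, not one from the book.

arcs = {
    "s": {"a": 2, "b": 5},   # arcs[node] maps each successor to its arc cost
    "a": {"b": 1, "t": 6},
    "b": {"t": 2},
    "t": {},                 # terminal node
}

def shortest_path_value(node, arcs, memo=None):
    """Value function V(node): cost of the cheapest path from node to 't'."""
    if memo is None:
        memo = {}
    if node == "t":
        return 0.0
    if node not in memo:
        memo[node] = min(cost + shortest_path_value(nxt, arcs, memo)
                         for nxt, cost in arcs[node].items())
    return memo[node]

print(shortest_path_value("s", arcs))  # s -> a -> b -> t costs 2 + 1 + 2 = 5.0
```

The memoization dictionary is what distinguishes DP from naive recursion: each node's value is computed once and reused, which is the same idea that the book's numerical chapters scale up to stochastic and continuous-state problems.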

✦ Table of Contents


Preface
Contents
1 The Dynamic Programming Principle
1.1 What Is Dynamic Programming?
1.2 Dynamic Decision Problems
1.2.1 Finite Horizon, Discounted Problems
1.2.2 Infinite Horizon, Discounted Problems
1.2.3 Infinite Horizon, Average Contribution Per Stage Problems
1.2.4 Problems with an Undefined Horizon
1.2.5 Decision Policies
1.3 An Example: Dynamic Single-Item Lot-Sizing
1.4 A Glimpse of the DP Principle: The Shortest Path Problem
1.4.1 Forward vs. Backward DP
1.4.2 Shortest Paths on Structured Networks
1.4.3 Stochastic Shortest Paths
1.5 The DP Decomposition Principle
1.5.1 Stochastic DP for Finite Time Horizons
1.5.2 Stochastic DP for Infinite Time Horizons
1.6 For Further Reading
References
2 Implementing Dynamic Programming
2.1 Discrete Resource Allocation: The Knapsack Problem
2.2 Continuous Budget Allocation
2.2.1 Interlude: Function Interpolation by Cubic Splines in MATLAB
2.2.2 Solving the Continuous Budget Allocation Problem by Numerical DP
2.3 Stochastic Inventory Control
2.4 Exploiting Structure
2.4.1 Using Shortest Paths to Solve the Deterministic Lot-Sizing Problem
2.4.2 Stochastic Lot-Sizing: S and (s,S) Policies
2.4.3 Structural Properties of Value Functions
2.5 The Curses of Dynamic Programming
2.5.1 The Curse of State Dimensionality
2.5.2 The Curse of Optimization
2.5.3 The Curse of Expectation
2.5.4 The Curse of Modeling
2.6 For Further Reading
References
3 Modeling for Dynamic Programming
3.1 Finite Markov Decision Processes
3.2 Different Shades of Stochastic DP
3.2.1 Post-decision State Variables
3.3 Variations on Inventory Management
3.3.1 Deterministic Lead Time
3.3.2 Perishable Items
3.4 Revenue Management
3.4.1 Static Model with Perfect Demand Segmentation
3.4.2 Dynamic Model with Perfect Demand Segmentation
3.4.3 Dynamic Model with Customer Choice
3.5 Pricing Financial Options with Early Exercise Features
3.5.1 Bias Issues in Dynamic Programming
3.6 Consumption–Saving with Uncertain Labor Income
3.7 For Further Reading
References
4 Numerical Dynamic Programming for Discrete States
4.1 Discrete-Time Markov Chains
4.2 Markov Decision Processes with a Finite Time Horizon
4.2.1 A Numerical Example: Random Walks and Optimal Stopping
4.3 Markov Decision Processes with an Infinite Time Horizon
4.4 Value Iteration
4.4.1 A Numerical Example of Value Iteration
Speeding Up Value Iteration
4.5 Policy Iteration
4.5.1 A Numerical Example of Policy Iteration
4.6 Value vs. Policy Iteration
4.7 Average Contribution Per Stage
4.7.1 Relative Value Iteration for Problems Involving Average Contributions Per Stage
4.7.2 Policy Iteration for Problems Involving Average Contributions Per Stage
4.7.3 An Example of Application to Preventive Maintenance
4.8 For Further Reading
References
5 Approximate Dynamic Programming and Reinforcement Learning for Discrete States
5.1 Sampling and Estimation in Non-stationary Settings
5.1.1 The Exploration vs. Exploitation Tradeoff
5.1.2 Non-stationarity and Exponential Smoothing
5.2 Learning by Temporal Differences and SARSA
5.3 Q-Learning for Finite MDPs
5.3.1 A Numerical Example
5.4 For Further Reading
References
6 Numerical Dynamic Programming for Continuous States
6.1 Solving Finite Horizon Problems by Standard Numerical Methods
6.2 A Numerical Approach to Consumption–Saving
6.2.1 Approximating the Optimal Policy by Numerical DP
6.2.2 Optimizing a Fixed Policy
6.2.3 Computational Experiments
6.3 Computational Refinements and Extensions
6.4 For Further Reading
References
7 Approximate Dynamic Programming and Reinforcement Learning for Continuous States
7.1 Option Pricing by ADP and Linear Regression
7.2 A Basic Framework for ADP
7.3 Least-Squares Policy Iteration
7.4 For Further Reading
References
Index


📜 SIMILAR VOLUMES


Model-Based Reinforcement Learning: From
✍ Jun Liu, Milad Farsi 📂 Library 📅 2023 🏛 Wiley-IEEE Press 🌐 English

Model-Based Reinforcement Learning. Explore a comprehensive and practical approach to reinforcement learning. Reinforcement learning is an essential paradigm of machine learning, wherein an intelligent agent performs actions that ensure optimal behavior from dev

Shortest Path Solvers. From Software to
✍ Andrew Adamatzky 📂 Library 📅 2018 🏛 Springer International Publishing 🌐 English

This book offers advanced parallel and distributed algorithms and experimental laboratory prototypes of unconventional shortest path solvers. In addition, it presents novel and unique algorithms for solving shortest path problems in massively parallel cellular automaton machines. The shortest path p