From Shortest Paths to Reinforcement Learning: A MATLAB-Based Tutorial on Dynamic Programming

✍ Scribed by Paolo Brandimarte


Publisher: Springer
Year: 2021
Tongue: English
Leaves: 216
Series: EURO Advanced Tutorials on Operational Research
Category: Library


✦ Synopsis


Dynamic programming (DP) has a long history as a powerful and flexible optimization principle, but has a bad reputation as a computationally impractical tool. This book fills a gap between the statement of DP principles and their actual software implementation. Using MATLAB throughout, this tutorial gently gets the reader acquainted with DP and its potential applications, offering the possibility of actual experimentation and hands-on experience. The book assumes basic familiarity with probability and optimization, and is suitable for both practitioners and graduate students in engineering, applied mathematics, management, finance, and economics.
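The decomposition principle the synopsis alludes to, and which Chapter 1 introduces through the shortest path problem, fits in a few lines of code. Below is a minimal sketch of the backward value recursion on an invented four-node network (node names and arc costs are made up for illustration); the book itself works in MATLAB, but this sketch uses Python:

```python
# Backward DP for shortest paths on a small acyclic network.
# V(node) = min over successors n of [arc cost to n + V(n)], with V(t) = 0.
# The network below is an invented example, not one from the book.

arcs = {
    "s": {"a": 2, "b": 5},   # arcs[node] maps each successor to its arc cost
    "a": {"b": 1, "t": 6},
    "b": {"t": 2},
    "t": {},                 # terminal node
}

def shortest_path_value(node, arcs, memo=None):
    """Value function V(node): cost of the cheapest path from node to 't'."""
    if memo is None:
        memo = {}
    if node == "t":
        return 0.0
    if node not in memo:
        memo[node] = min(cost + shortest_path_value(nxt, arcs, memo)
                         for nxt, cost in arcs[node].items())
    return memo[node]

print(shortest_path_value("s", arcs))  # s -> a -> b -> t costs 2 + 1 + 2 = 5.0
```

The memoization dictionary is what distinguishes DP from naive recursion: each node's value is computed once and reused, which is the same idea that the book's numerical chapters scale up to stochastic and continuous-state problems.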

✦ Table of Contents


Preface
Contents
1 The Dynamic Programming Principle
1.1 What Is Dynamic Programming?
1.2 Dynamic Decision Problems
1.2.1 Finite Horizon, Discounted Problems
1.2.2 Infinite Horizon, Discounted Problems
1.2.3 Infinite Horizon, Average Contribution Per Stage Problems
1.2.4 Problems with an Undefined Horizon
1.2.5 Decision Policies
1.3 An Example: Dynamic Single-Item Lot-Sizing
1.4 A Glimpse of the DP Principle: The Shortest Path Problem
1.4.1 Forward vs. Backward DP
1.4.2 Shortest Paths on Structured Networks
1.4.3 Stochastic Shortest Paths
1.5 The DP Decomposition Principle
1.5.1 Stochastic DP for Finite Time Horizons
1.5.2 Stochastic DP for Infinite Time Horizons
1.6 For Further Reading
References
2 Implementing Dynamic Programming
2.1 Discrete Resource Allocation: The Knapsack Problem
2.2 Continuous Budget Allocation
2.2.1 Interlude: Function Interpolation by Cubic Splines in MATLAB
2.2.2 Solving the Continuous Budget Allocation Problem by Numerical DP
2.3 Stochastic Inventory Control
2.4 Exploiting Structure
2.4.1 Using Shortest Paths to Solve the Deterministic Lot-Sizing Problem
2.4.2 Stochastic Lot-Sizing: S and (s,S) Policies
2.4.3 Structural Properties of Value Functions
2.5 The Curses of Dynamic Programming
2.5.1 The Curse of State Dimensionality
2.5.2 The Curse of Optimization
2.5.3 The Curse of Expectation
2.5.4 The Curse of Modeling
2.6 For Further Reading
References
3 Modeling for Dynamic Programming
3.1 Finite Markov Decision Processes
3.2 Different Shades of Stochastic DP
3.2.1 Post-decision State Variables
3.3 Variations on Inventory Management
3.3.1 Deterministic Lead Time
3.3.2 Perishable Items
3.4 Revenue Management
3.4.1 Static Model with Perfect Demand Segmentation
3.4.2 Dynamic Model with Perfect Demand Segmentation
3.4.3 Dynamic Model with Customer Choice
3.5 Pricing Financial Options with Early Exercise Features
3.5.1 Bias Issues in Dynamic Programming
3.6 Consumption–Saving with Uncertain Labor Income
3.7 For Further Reading
References
4 Numerical Dynamic Programming for Discrete States
4.1 Discrete-Time Markov Chains
4.2 Markov Decision Processes with a Finite Time Horizon
4.2.1 A Numerical Example: Random Walks and Optimal Stopping
4.3 Markov Decision Processes with an Infinite Time Horizon
4.4 Value Iteration
4.4.1 A Numerical Example of Value Iteration
Speeding Up Value Iteration
4.5 Policy Iteration
4.5.1 A Numerical Example of Policy Iteration
4.6 Value vs. Policy Iteration
4.7 Average Contribution Per Stage
4.7.1 Relative Value Iteration for Problems Involving Average Contributions Per Stage
4.7.2 Policy Iteration for Problems Involving Average Contributions Per Stage
4.7.3 An Example of Application to Preventive Maintenance
4.8 For Further Reading
References
5 Approximate Dynamic Programming and Reinforcement Learning for Discrete States
5.1 Sampling and Estimation in Non-stationary Settings
5.1.1 The Exploration vs. Exploitation Tradeoff
5.1.2 Non-stationarity and Exponential Smoothing
5.2 Learning by Temporal Differences and SARSA
5.3 Q-Learning for Finite MDPs
5.3.1 A Numerical Example
5.4 For Further Reading
References
6 Numerical Dynamic Programming for Continuous States
6.1 Solving Finite Horizon Problems by Standard Numerical Methods
6.2 A Numerical Approach to Consumption–Saving
6.2.1 Approximating the Optimal Policy by Numerical DP
6.2.2 Optimizing a Fixed Policy
6.2.3 Computational Experiments
6.3 Computational Refinements and Extensions
6.4 For Further Reading
References
7 Approximate Dynamic Programming and Reinforcement Learning for Continuous States
7.1 Option Pricing by ADP and Linear Regression
7.2 A Basic Framework for ADP
7.3 Least-Squares Policy Iteration
7.4 For Further Reading
References
Index


📜 SIMILAR VOLUMES


Model-Based Reinforcement Learning: From
✍ Jun Liu, Milad Farsi 📂 Library 📅 2023 🏛 Wiley-IEEE Press 🌐 English

Model-Based Reinforcement Learning. Explore a comprehensive and practical approach to reinforcement learning. Reinforcement learning is an essential paradigm of machine learning, wherein an intelligent agent performs actions that ensure optimal behavior from dev

Shortest Path Solvers. From Software to
✍ Andrew Adamatzky 📂 Library 📅 2018 🏛 Springer International Publishing 🌐 English

This book offers advanced parallel and distributed algorithms and experimental laboratory prototypes of unconventional shortest path solvers. In addition, it presents novel and unique algorithms for solving shortest path problems in massively parallel cellular automaton machines. The shortest path p