𝔖 Scriptorium
✦   LIBER   ✦


Simulation-based Algorithms for Markov Decision Processes (Communications and Control Engineering)

โœ Scribed by Hyeong Soo Chang, Michael C. Fu, Jiaqiao Hu, Steven I. Marcus,


Year
2007
Tongue
English
Leaves
202
Edition
1st Edition.
Category
Library


✦ Synopsis


Markov decision process (MDP) models are widely used for modeling sequential decision-making problems that arise in engineering, economics, computer science, and the social sciences. This book brings the state-of-the-art research together for the first time. It provides practical modeling methods for many real-world problems with high dimensionality or complexity which have not hitherto been treatable with Markov decision processes.
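For readers new to the formalism the synopsis describes, here is a minimal sketch of classical value iteration (the baseline method the book's simulation-based algorithms aim to scale beyond). The 2-state, 2-action MDP below is a made-up illustration, not an example from the book; the transition matrix `P`, rewards `R`, and discount `gamma` are assumed values.

```python
import numpy as np

# Made-up 2-state, 2-action MDP for illustration.
P = np.array([  # P[a, s, s'] = probability of moving s -> s' under action a
    [[0.9, 0.1], [0.2, 0.8]],  # action 0
    [[0.5, 0.5], [0.7, 0.3]],  # action 1
])
R = np.array([  # R[a, s] = expected one-step reward for action a in state s
    [1.0, 0.0],
    [0.5, 2.0],
])
gamma = 0.9  # discount factor

# Value iteration: repeatedly apply the Bellman optimality backup
#   Q[a, s] = R[a, s] + gamma * sum_{s'} P[a, s, s'] * V[s']
#   V[s]    = max_a Q[a, s]
V = np.zeros(2)
for _ in range(1000):
    Q = R + gamma * P @ V          # stacked matmul: Q has shape (2 actions, 2 states)
    V_new = Q.max(axis=0)
    if np.abs(V_new - V).max() < 1e-10:
        break                      # value function has (numerically) converged
    V = V_new

policy = Q.argmax(axis=0)  # greedy policy with respect to the converged values
print(V, policy)
```

The point of the book is that this exact backup becomes infeasible when the state or action space is huge, which is what motivates the sampling- and population-based methods listed in the table of contents.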

✦ Table of Contents


Contents......Page 11
Selected Notation and Abbreviations......Page 14
1. Markov Decision Processes......Page 17
1.1 Optimality Equations......Page 19
1.2 Policy Iteration and Value Iteration......Page 21
1.3 Rolling-horizon Control......Page 23
1.4 Survey of Previous Work on Computational Methods......Page 24
1.5 Simulation......Page 27
1.6 Preview of Coming Attractions......Page 30
1.7 Notes......Page 31
2. Multi-stage Adaptive Sampling Algorithms......Page 33
2.1.1 Regret Analysis in Multi-armed Bandits......Page 35
2.1.2 Algorithm Description......Page 36
2.1.3 Alternative Estimators......Page 37
2.1.4 Convergence Analysis......Page 40
2.1.5 Numerical Example......Page 47
2.2 Pursuit Learning Automata Sampling......Page 56
2.2.1 Algorithm Description......Page 57
2.2.2 Convergence Analysis......Page 58
2.2.3 Application to POMDPs......Page 67
2.2.4 Numerical Example......Page 69
2.3 Notes......Page 75
3. Population-based Evolutionary Approaches......Page 76
3.1.1 Policy Switching......Page 78
3.1.3 Stopping Rule......Page 80
3.1.4 Convergence Analysis......Page 81
3.2 Evolutionary Random Policy Search......Page 82
3.2.1 Policy Improvement with Reward Swapping......Page 83
3.2.2 Exploration......Page 86
3.2.3 Convergence Analysis......Page 88
3.3.1 A One-dimensional Queueing Example......Page 91
3.3.2 A Two-dimensional Queueing Example......Page 100
3.5 Notes......Page 102
4. Model Reference Adaptive Search......Page 104
4.1 The Model Reference Adaptive Search Method......Page 106
4.1.1 The MRAS₀ Algorithm (Idealized Version)......Page 108
4.1.2 The MRAS₁ Algorithm (Adaptive Monte Carlo Version)......Page 111
4.1.3 The MRAS₂ Algorithm (Stochastic Optimization)......Page 113
4.2.1 MRAS₀ Convergence......Page 116
4.2.2 MRAS₁ Convergence......Page 122
4.2.3 MRAS₂ Convergence......Page 131
4.3 Application to MDPs via Direct Policy Learning......Page 144
4.3.2 Infinite-horizon MDPs......Page 145
4.3.4 Numerical Examples......Page 147
4.4.1 Algorithm Description......Page 156
4.4.2 Numerical Examples......Page 158
4.5 Application to Finite-horizon MDPs Using Adaptive Sampling......Page 161
4.6 Notes......Page 163
5. On-line Control Methods via Simulation......Page 164
5.1 Simulated Annealing Multiplicative Weights Algorithm......Page 168
5.1.1 Basic Algorithm Description......Page 169
5.1.2 Convergence Analysis......Page 170
5.1.3 Convergence of the Sampling Version of the Algorithm......Page 173
5.1.4 Numerical Example......Page 175
5.1.5 Simulated Policy Switching......Page 179
5.2 Rollout......Page 180
5.2.1 Parallel Rollout......Page 181
5.3 Hindsight Optimization......Page 183
5.3.1 Numerical Example......Page 184
5.4 Notes......Page 189
References......Page 191
Index......Page 200


📜 SIMILAR VOLUMES


Simulation-Based Algorithms for Markov Decision Processes
โœ Hyeong Soo Chang, Jiaqiao Hu, Michael C. Fu, Steven I. Marcus (auth.) ๐Ÿ“‚ Library ๐Ÿ“… 2013 ๐Ÿ› Springer-Verlag London ๐ŸŒ English

Markov decision process (MDP) models are widely used for modeling sequential decision-making problems that arise in engineering, economics, computer science, and the social sciences. Many real-world problems modeled by MDPs have huge state and/or action spaces, giving an opening to the curse of dimensionality…

Simulation-based Algorithms for Markov Decision Processes
โœ Hyeong Soo Chang, Michael C. Fu, Jiaqiao Hu, Steven I. Marcus ๐Ÿ“‚ Library ๐Ÿ“… 2007 ๐Ÿ› Springer ๐ŸŒ English

Markov decision process (MDP) models are widely used for modeling sequential decision-making problems that arise in engineering, economics, computer science, and the social sciences. It is well-known that many real-world problems modeled by MDPs have huge state and/or action spaces, leading to the notorious curse of dimensionality…

Markov Chains and Decision Processes for Engineers and Managers
โœ Theodore J. Sheskin (Author) ๐Ÿ“‚ Library ๐Ÿ“… 2011 ๐Ÿ› CRC Press

Recognized as a powerful tool for dealing with uncertainty, Markov modeling can enhance your ability to analyze complex production and service systems. However, most books on Markov chains or decision processes are often either highly theoretical, with few examples, or highly prescriptive, with l…

Markov Chains and Decision Processes for Engineers and Managers
โœ Theodore J. Sheskin ๐Ÿ“‚ Library ๐Ÿ“… 2010 ๐Ÿ› CRC Press ๐ŸŒ English

Recognized as a powerful tool for dealing with uncertainty, Markov modeling can enhance your ability to analyze complex production and service systems. However, most books on Markov chains or decision processes are often either highly theoretical, with few examples, or highly prescriptive, with l…