Algorithms for Reinforcement Learning
✍ Scribed by Csaba Szepesvári
- Publisher: Morgan & Claypool
- Year: 2010
- Language: English
- Pages: 103
- Series: Synthesis Lectures on Artificial Intelligence and Machine Learning
- Category: Library
✦ Synopsis
Reinforcement learning is a learning paradigm concerned with learning to control a system so as to maximize a numerical performance measure that expresses a long-term objective. What distinguishes reinforcement learning from supervised learning is that only partial feedback is given to the learner about the learner's predictions. Further, the predictions may have long-term effects through influencing the future state of the controlled system. Thus, time plays a special role. The goal in reinforcement learning is to develop efficient learning algorithms, as well as to understand the algorithms' merits and limitations. Reinforcement learning is of great interest because of the large number of practical applications that it can be used to address, ranging from problems in artificial intelligence to operations research or control engineering. In this book, we focus on those algorithms of reinforcement learning that build on the powerful theory of dynamic programming. We give a fairly comprehensive catalog of learning problems, describe the core ideas, and note a large number of state-of-the-art algorithms, followed by a discussion of their theoretical properties and limitations.
✦ Table of Contents
Preface
Acknowledgments
Markov Decision Processes
Value functions
Dynamic programming algorithms for solving MDPs
Tabular TD(0)
Every-visit Monte-Carlo
TD(λ): Unifying Monte-Carlo and TD(0)
Algorithms for large state spaces
TD(λ) with function approximation
Gradient temporal difference learning
Least-squares methods
The choice of the function space
A catalog of learning problems
Online learning in bandits
Active learning in bandits
Active learning in Markov Decision Processes
Online learning in Markov Decision Processes
Q-learning in finite MDPs
Q-learning with function approximation
Actor-critic methods
Implementing a critic
Implementing an actor
Applications
Software
Contractions and Banach's fixed-point theorem
Application to MDPs
Bibliography
Author's Biography
✦ Subjects
Computer Science and Engineering; Artificial Intelligence
📜 SIMILAR VOLUMES
Develop self-learning algorithms and agents using TensorFlow and other Python tools, frameworks, and libraries Key Features • Learn, develop, and deploy advanced reinforcement learning algorithms to solve a variety of tasks • Understand and develop model-free and model-based algorithms for buil
With this book, you will understand the core concepts and techniques of reinforcement learning. You will take a hands-on approach with each RL algorithm and will develop your own self-learning algorithms and models. You will optimize the algorithms for better precision, use high-speed actions and lo