Deep Reinforcement Learning with Python: With PyTorch, TensorFlow and OpenAI Gym

✍ Scribed by Nimish Sanghi

Publisher: Apress
Year: 2021
Tongue: English
Leaves: 394
Edition: 1
Category: Library

✦ Synopsis

Deep reinforcement learning is a fast-growing discipline that is making a significant impact in fields such as autonomous vehicles, robotics, healthcare, and finance. This book covers deep reinforcement learning using deep Q-learning and policy gradient models, with coding exercises.

You'll begin by reviewing Markov decision processes, Bellman equations, and dynamic programming, which form the core concepts and foundation of deep reinforcement learning. Next, you'll study model-free learning, followed by function approximation using neural networks and deep learning. After that come various deep reinforcement learning algorithms, such as deep Q-networks, several flavors of actor-critic methods, and other policy-based methods.

You'll also look at the exploration vs. exploitation dilemma, a key consideration in reinforcement learning algorithms, along with Monte Carlo tree search (MCTS), which played a key role in the success of AlphaGo. The final chapters conclude with deep reinforcement learning implementations using popular deep learning frameworks such as TensorFlow and PyTorch. By the end, you'll understand deep reinforcement learning and how to implement deep Q-networks and policy gradient models with TensorFlow, PyTorch, and OpenAI Gym.

What You'll Learn

Examine deep reinforcement learning
Implement deep learning algorithms using OpenAI’s Gym environment
Code your own game playing agents for Atari using actor-critic algorithms
Apply best practices for model building and algorithm training

Who This Book Is For

Machine learning developers and architects who want to stay ahead of the curve in the field of AI and deep learning.

✦ Table of Contents

About the Author
About the Technical Reviewer
Acknowledgments
Introduction
Chapter 1: Introduction to Reinforcement Learning
Reinforcement Learning
Machine Learning Branches
Supervised Learning
Unsupervised Learning
Reinforcement Learning
Core Elements
Deep Learning with Reinforcement Learning
Examples and Case Studies
Autonomous Vehicles
Robots
Recommendation Systems
Finance and Trading
Healthcare
Game Playing
Libraries and Environment Setup
Alternate Way to Install Local Environment
Summary
Chapter 2: Markov Decision Processes
Definition of Reinforcement Learning
Agent and Environment
Rewards
Markov Processes
Markov Chains
Markov Reward Processes
Markov Decision Processes
Policies and Value Functions
Bellman Equations
Optimality Bellman Equations
Types of Solution Approaches with a Mind-Map
Summary
Chapter 3: Model-Based Algorithms
OpenAI Gym
Dynamic Programming
Policy Evaluation/Prediction
Policy Improvement and Iterations
Value Iteration
Generalized Policy Iteration
Asynchronous Backups
Summary
Chapter 4: Model-Free Approaches
Estimation/Prediction with Monte Carlo
Bias and Variance of MC Prediction Methods
Control with Monte Carlo
Off-Policy MC Control
Temporal Difference Learning Methods
Temporal Difference Control
On-Policy SARSA
Q-Learning: An Off-Policy TD Control
Maximization Bias and Double Learning
Expected SARSA Control
Replay Buffer and Off-Policy Learning
Q-Learning for Continuous State Spaces
n-Step Returns
Eligibility Traces and TD(λ)
Relationships Between DP, MC, and TD
Summary
Chapter 5: Function Approximation
Introduction
Theory of Approximation
Coarse Coding
Tile Encoding
Challenges in Approximation
Incremental Prediction: MC, TD, TD(λ)
Incremental Control
Semi-gradient N-step SARSA Control
Semi-gradient SARSA(λ) Control
Convergence in Functional Approximation
Gradient Temporal Difference Learning
Batch Methods (DQN)
Linear Least Squares Method
Deep Learning Libraries
Summary
Chapter 6: Deep Q-Learning
Deep Q Networks
Atari Game-Playing Agent Using DQN
Prioritized Replay
Double Q-Learning
Dueling DQN
NoisyNets DQN
Categorical 51-Atom DQN (C51)
Quantile Regression DQN
Hindsight Experience Replay
Summary
Chapter 7: Policy Gradient Algorithms
Introduction
Pros and Cons of Policy-Based Methods
Policy Representation
Discrete Case
Continuous Case
Policy Gradient Derivation
Objective Function
Derivative Update Rule
Intuition Behind the Update Rule
REINFORCE Algorithm
Variance Reduction with Reward to Go
Further Variance Reduction with Baselines
Actor-Critic Methods
Defining Advantage
Advantage Actor Critic
Implementation of the A2C Algorithm
Asynchronous Advantage Actor Critic
Trust Region Policy Optimization Algorithm
Proximal Policy Optimization Algorithm
Summary
Chapter 8: Combining Policy Gradient and Q-Learning
Trade-Offs in Policy Gradient and Q-Learning
General Framework to Combine Policy Gradient with Q-Learning
Deep Deterministic Policy Gradient
Q-Learning in DDPG (Critic)
Policy Learning in DDPG (Actor)
Pseudocode and Implementation
Gym Environments Used in Code
Code Listing
Policy Network Actor (PyTorch)
Policy Network Actor (TensorFlow)
Q-Network Critic Implementation
PyTorch
TensorFlow
Combined Model-Actor Critic Implementation
Experience Replay
Q-Loss Implementation
PyTorch
TensorFlow
Policy Loss Implementation
One Step Update Implementation
DDPG: Main Loop
Twin Delayed DDPG
Target-Policy Smoothing
Q-Loss (Critic)
Policy Loss (Actor)
Delayed Update
Pseudocode and Implementation
Code Implementation
Combined Model-Actor Critic Implementation
Q-Loss Implementation
Policy-Loss Implementation
One-Step Update Implementation
TD3 Main Loop
Reparameterization Trick
Score/Reinforce Way
Reparameterization Trick and Pathwise Derivatives
Experiment
Entropy Explained
Soft Actor Critic
SAC vs. TD3
Q-Loss with Entropy-Regularization
Policy Loss with Reparameterization Trick
Pseudocode and Implementation
Code Implementation
Policy Network-Actor Implementation
Q-Network, Combined Model, and Experience Replay
Q-Loss and Policy-Loss Implementation
One-Step Update and SAC Main Loop
Summary
Chapter 9: Integrated Planning and Learning
Model-Based Reinforcement Learning
Planning with a Learned Model
Integrating Learning and Planning (Dyna)
Dyna Q and Changing Environments
Dyna Q+
Expected vs. Sample Updates
Exploration vs. Exploitation
Multi-arm Bandit
Regret: Measure of Quality of Exploration
Epsilon Greedy Exploration
Upper Confidence Bound Exploration
Thompson Sampling Exploration
Comparing Different Exploration Strategies
Planning at Decision Time and Monte Carlo Tree Search
AlphaGo Walk-Through
Summary
Chapter 10: Further Exploration and Next Steps
Model-Based RL: Additional Approaches
World Models
Imagination-Augmented Agents (I2A)
Model-Based RL with Model-Free Fine-Tuning (MBMF)
Model-Based Value Expansion (MBVE)
Imitation Learning and Inverse Reinforcement Learning
Derivative-Free Methods
Transfer Learning and Multitask Learning
Meta-Learning
Popular RL Libraries
How to Continue Studying
Summary
Index

📜 SIMILAR VOLUMES

Deep Reinforcement Learning with Python:

📁 Deep Reinforcement Learning with Python: With PyTorch, TensorFlow and OpenAI Gym

✍ Nimish Sanghi 📂 Library 📅 2021 🏛 Apress 🌐 English

Deep reinforcement learning is a fast-growing discipline that is making a significant impact in fields such as autonomous vehicles, robotics, healthcare, and finance. This book covers deep reinforcement learning using deep Q-learning and policy gradient models with coding exerci

Deep reinforcement learning with Python

📁 Deep reinforcement learning with Python : with Pytorch, Tensorflow and OpenAI Gym

✍ Nimish Sanghi 📂 Library 📅 2021 🏛 Apress 🌐 English

Applied Reinforcement Learning with Pyth

📁 Applied Reinforcement Learning with Python. With OpenAI Gym, Tensorflow and Keras

✍ Taweh Beysolow 📂 Library 📅 2019 🏛 Apress 🌐 English

Applied Reinforcement Learning with Pyth

📁 Applied Reinforcement Learning with Python: With OpenAI Gym, Tensorflow, and Keras

✍ Taweh Beysolow II 📂 Library 📅 2019 🏛 Apress 🌐 English

Delve into the world of reinforcement learning algorithms and apply them to different use-cases via Python. This book covers important topics such as policy gradients and Q learning, and utilizes frameworks such as Tensorflow, Keras, and OpenAI Gym. Applied Reinforcem

Applied Reinforcement Learning with Pyth

📁 Applied Reinforcement Learning with Python: With OpenAI Gym, Tensorflow, and Keras

✍ Beysolow II, Taweh 📂 Library 📅 2019 🏛 Apress L.P 🌐 English

Delve into the world of reinforcement learning algorithms and apply them to different use-cases via Python. This book covers important topics such as policy gradients and Q learning, and utilizes frameworks such as Tensorflow, Keras, and OpenAI Gym. Applied Reinforcement Learning with Python introdu

Applied Reinforcement Learning with Pyth

📁 Applied Reinforcement Learning with Python: With Openai Gym, Tensorflow, and Keras

✍ Taweh Beysolow 📂 Library 📅 2019 🏛 Apress 🌐 English