
Two steps reinforcement learning

✍ Scribed by Fernando Fernández; Daniel Borrajo


Publisher
John Wiley and Sons
Year
2008
Tongue
English
Weight
977 KB
Volume
23
Category
Article
ISSN
0884-8173


✦ Synopsis


When reinforcement learning is applied in domains with very large or continuous state spaces, the experience that the learning agent gains by interacting with the environment must be generalized. Generalization methods usually approximate the value functions used to compute the action policy, and they do so in one of two ways: either by approximating the value functions with a supervised learning method, or by discretizing the environment so that a tabular representation of the value functions can be used. In this work, we propose an algorithm that combines both approaches to exploit the benefits of each, yielding higher performance. The approach has two learning phases. In the first, a learner is used as a supervised function approximator, but with a machine learning technique that also outputs a discretization of the state space, as nearest prototype classifiers and decision trees do. In the second, the discretization computed in the first phase is used to build a tabular representation of the value function computed in that phase, which allows the value function approximation to be tuned. Experiments in different domains show that executing both learning phases improves on the results obtained by executing only the first one. The results take into account both the resources used and the performance of the learned behavior.
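
The two-phase idea in the synopsis can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy one-dimensional task (`step`), the tiny regression tree (`tree_splits`, `fit`, `predict`), the use of fitted Q iteration in the first phase, and every parameter value are assumptions chosen only to keep the example short and self-contained. Phase one fits one tree per action and keeps the split thresholds; phase two reuses those thresholds as a state discretization and tunes a Q table, initialized from the phase-one estimates, with ordinary tabular Q-learning.

```python
import bisect
import random

GAMMA = 0.9
ACTIONS = (-0.1, 0.1)  # hypothetical toy task: move left or right on [0, 1]

def step(x, a):
    """Toy dynamics (an assumption): reward 1 and termination at x >= 1."""
    nx = min(max(x + ACTIONS[a], 0.0), 1.0)
    done = nx >= 1.0
    return nx, (1.0 if done else 0.0), done

def sse(ys):
    """Sum of squared deviations from the mean."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)

def tree_splits(points, depth):
    """Split thresholds of a small 1D regression tree; points sorted by x."""
    if depth == 0 or len(points) < 4:
        return []
    ys = [y for _, y in points]
    cost = lambda i: sse(ys[:i]) + sse(ys[i:])
    i = min(range(2, len(points) - 1), key=cost)
    if cost(i) >= sse(ys) - 1e-12:
        return []  # no split reduces the squared error
    cut = (points[i - 1][0] + points[i][0]) / 2
    return (tree_splits(points[:i], depth - 1) + [cut]
            + tree_splits(points[i:], depth - 1))

def fit(points, depth=3):
    """Fit the tree; return (thresholds, mean target per leaf)."""
    points = sorted(points)
    cuts = tree_splits(points, depth)
    sums = [[0.0, 0] for _ in range(len(cuts) + 1)]
    for x, y in points:
        leaf = sums[bisect.bisect_right(cuts, x)]
        leaf[0] += y
        leaf[1] += 1
    return cuts, [s / n if n else 0.0 for s, n in sums]

def predict(model, x):
    cuts, means = model
    return means[bisect.bisect_right(cuts, x)]

def phase_one(rng, n=300, iters=15):
    """Phase 1: supervised value approximation (fitted Q iteration here)."""
    data = []
    for _ in range(n):
        x, a = rng.random(), rng.randrange(2)
        data.append((x, a) + step(x, a))  # (x, a, next_x, reward, done)
    models = [([], [0.0]), ([], [0.0])]  # one model per action, all zeros
    for _ in range(iters):
        targets = [[], []]
        for x, a, nx, r, done in data:
            future = 0.0 if done else max(predict(m, nx) for m in models)
            targets[a].append((x, r + GAMMA * future))
        models = [fit(t) for t in targets]
    return models

def phase_two(models, rng, episodes=300, alpha=0.2, eps=0.1):
    """Phase 2: tabular Q-learning on the discretization from phase 1."""
    cuts = sorted(set(models[0][0]) | set(models[1][0]))
    edges = [0.0] + cuts + [1.0]
    # Initialize each cell from the phase-one estimate at the cell midpoint.
    q = [[predict(m, (lo + hi) / 2) for m in models]
         for lo, hi in zip(edges, edges[1:])]
    for _ in range(episodes):
        x = rng.random() * 0.9  # start somewhere below the goal
        for _ in range(50):
            s = bisect.bisect_right(cuts, x)
            if rng.random() < eps:
                a = rng.randrange(2)
            else:
                a = max((0, 1), key=lambda b: q[s][b])
            nx, r, done = step(x, a)
            ns = bisect.bisect_right(cuts, nx)
            target = r if done else r + GAMMA * max(q[ns])
            q[s][a] += alpha * (target - q[s][a])
            x = nx
            if done:
                break
    return cuts, q

models = phase_one(random.Random(0))
cuts, q = phase_two(models, random.Random(1))
```

Here `cuts` holds the region boundaries produced in the first phase and `q[cell][action]` holds the tuned tabular values. Taking the union of the per-action thresholds is one simple way to obtain a single discretization; other choices are possible.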


📜 SIMILAR VOLUMES


Constructive reinforcement learning
✍ Jose Hernandez-Orallo 📂 Article 📅 2000 🏛 John Wiley and Sons 🌐 English ⚖ 163 KB

This paper presents an operative measure of reinforcement for constructive learning methods, i.e., eager learning methods using highly expressible or universal representation languages. These evaluation tools allow a further insight in the study of the growth of knowledge, theory revision, and a