New algorithms of the Q-learning type

โœ Scribed by Shalabh Bhatnagar; K. Mohan Babu


Book ID: 104002886
Publisher: Elsevier Science
Year: 2008
Tongue: English
Weight: 284 KB
Volume: 44
Category: Article
ISSN: 0005-1098


✦ Synopsis


We propose two Q-learning algorithms based on the two-timescale stochastic approximation methodology. The first updates the Q-values of all feasible state-action pairs at each instant, while the second updates the Q-values of states with actions chosen according to the 'current' randomized policy updates. A proof of convergence of the algorithms is given. Finally, numerical experiments with the proposed algorithms on a routing application in communication networks are presented in a few different settings.
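
To make the two-timescale idea concrete, the following is a minimal, generic sketch and not the recursions from the article, which the synopsis does not spell out. It assumes a tiny tabular MDP with one-step costs, a softmax randomized policy, Q-values updated with the larger step size `a_n`, and policy preferences updated with the smaller step size `b_n`. The function name `two_timescale_q`, the step-size exponents, the softmax parameterization, and the choice of which quantity runs on which timescale are all illustrative assumptions.

```python
import math
import random


def two_timescale_q(P, cost, gamma=0.9, iters=2000, seed=0):
    """Generic two-timescale sketch on a tabular MDP (hypothetical, not the article's algorithm).

    P[s][a] is a list of next-state probabilities; cost[s][a] is the one-step cost.
    Returns the Q-table and the final randomized policy.
    """
    rng = random.Random(seed)
    n_states, n_actions = len(P), len(P[0])
    Q = [[0.0] * n_actions for _ in range(n_states)]
    theta = [[0.0] * n_actions for _ in range(n_states)]  # policy preferences (assumed form)

    def policy(s):
        # Softmax over preferences: the 'current' randomized policy.
        m = max(theta[s])
        w = [math.exp(t - m) for t in theta[s]]
        z = sum(w)
        return [x / z for x in w]

    for n in range(1, iters + 1):
        a_n = 1.0 / n ** 0.6  # larger steps: Q-values move on the faster timescale
        b_n = 1.0 / n         # smaller steps: policy moves slower, b_n / a_n -> 0

        # Update the Q-value of every feasible state-action pair at this instant.
        for s in range(n_states):
            for a in range(n_actions):
                s_next = rng.choices(range(n_states), weights=P[s][a])[0]
                pi_next = policy(s_next)
                v_next = sum(p * q for p, q in zip(pi_next, Q[s_next]))
                Q[s][a] += a_n * (cost[s][a] + gamma * v_next - Q[s][a])

        # Slowly shift probability toward actions with below-average cost.
        for s in range(n_states):
            pi = policy(s)
            avg = sum(p * q for p, q in zip(pi, Q[s]))
            for a in range(n_actions):
                theta[s][a] += b_n * (avg - Q[s][a])

    return Q, [policy(s) for s in range(n_states)]


if __name__ == "__main__":
    # Two states, two actions, uniform transitions; action 1 is always cheaper.
    P = [[[0.5, 0.5], [0.5, 0.5]], [[0.5, 0.5], [0.5, 0.5]]]
    cost = [[1.0, 0.0], [1.0, 0.0]]
    Q, pi = two_timescale_q(P, cost)
    print(Q, pi)
```

A variant closer in spirit to the second algorithm described in the synopsis would sample a single action per state from `policy(s)` and update only that pair, rather than looping over all actions; that change is likewise an assumption about the general shape, not a transcription of the article.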

