Bobbio Scriptorium
LIBER

Bounding reward measures of Markov models using the Markov decision processes

Scribed by Peter Buchholz


Publisher: John Wiley and Sons
Year: 2011
Tongue: English
Weight: 388 KB
Volume: 18
Category: Article
ISSN: 1070-5325


Synopsis


SUMMARY

For a Markov reward process, where upper and lower bounds for the transition rates and rewards are known, a new approach to bound the expected reward is presented. Based on a previous paper where sharp bounds have been defined for the problem, but only an inefficient and unstable algorithm is proposed, this paper presents algorithms to compute the bounds by interpreting the problem as a Markov Decision Process. In this way, the well known value and policy iteration algorithms can be adopted to compute reward bounds in a stable and fairly efficient way. Different numerical techniques are presented for computing the reward bounds. Copyright © 2011 John Wiley & Sons, Ltd.
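To illustrate the idea of treating interval bounds as a decision problem, here is a minimal sketch in Python. It is not the algorithm from the paper: it assumes a simplified discrete-time, discounted setting with interval bounds on transition probabilities (the paper works with transition rates), and the names `extreme_row` and `bound_reward` are hypothetical. In each state the "decision" is the choice of a transition row inside its bounds, and value iteration picks the row that maximizes (upper bound) or minimizes (lower bound) the expected value.

```python
def extreme_row(lo, hi, v, maximize):
    """Pick a probability row p with lo <= p <= hi and sum(p) == 1
    that maximizes (or minimizes) the inner product p . v.
    Greedy rule: start at the lower bounds and hand the remaining
    probability mass to the most (or least) valuable states first."""
    order = sorted(range(len(v)), key=lambda j: v[j], reverse=maximize)
    p = list(lo)
    slack = 1.0 - sum(lo)
    for j in order:
        add = min(hi[j] - lo[j], slack)
        p[j] += add
        slack -= add
    return p


def bound_reward(P_lo, P_hi, r_lo, r_hi, upper=True,
                 gamma=0.9, tol=1e-10, max_iter=10000):
    """Value iteration for a bound on the expected discounted reward.
    Assumption: discount factor gamma < 1, so the iteration contracts."""
    n = len(r_lo)
    r = r_hi if upper else r_lo  # rewards enter additively, so use the extreme
    v = [0.0] * n
    for _ in range(max_iter):
        new_v = []
        for i in range(n):
            p = extreme_row(P_lo[i], P_hi[i], v, maximize=upper)
            new_v.append(r[i] + gamma * sum(pj * vj for pj, vj in zip(p, v)))
        if max(abs(a - b) for a, b in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    return v


# Illustrative two-state model (made-up numbers, not from the paper).
P_lo = [[0.5, 0.3], [0.2, 0.6]]
P_hi = [[0.7, 0.5], [0.4, 0.8]]
r_lo = [1.0, 0.0]
r_hi = [1.2, 0.1]

lower = bound_reward(P_lo, P_hi, r_lo, r_hi, upper=False)
upper = bound_reward(P_lo, P_hi, r_lo, r_hi, upper=True)
```

Any model whose transition rows and rewards lie inside the given intervals has a discounted reward between `lower` and `upper` in every state. Policy iteration could replace the fixed-point loop under the same assumptions.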


SIMILAR VOLUMES


Experimental optimization of a real time
Victor M. Saucedo; M. Nazmul Karim | Article | 1997 | John Wiley and Sons | English | 304 KB

This article describes a methodology that implements a Markov decision process (MDP) optimization technique in a real time fed-batch experiment. Biological systems can be better modeled under the stochastic framework and MDP is shown to be a suitable technique for their optimization. A nonlinear inp

On-line monitoring of pharmaceutical pro
Hui Zhang; Zhuangde Jiang; J.Y. Pi; H.K. Xu; R. Du | Article | 2009 | John Wiley and Sons | English | 505 KB

This article presents a new method for on-line monitoring of pharmaceutical production processes, especially the powder blending process. The new method consists of two parts: extracting features from the Near Infrared (NIR) spectroscopy signals and recognizing patterns from the features. Features are