Vector-valued Markov decision processes and the systems of linear inequalities

✍ Scribed by Kazuyoshi Wakuta

Publisher: Elsevier Science
Year: 1995
Tongue: English
Weight: 496 KB
Volume: 56
Category: Article
ISSN: 0304-4149
DOI: 10.1016/0304-4149(94)00064-z

📜 SIMILAR VOLUMES

On systems of vector-valued linear inequalities

✍ Hidetoshi Komiya 📂 Article 📅 1985 🏛 Elsevier Science 🌐 English ⚖ 386 KB

A New Class of Policies in Vector-Valued Markov Decision Processes

✍ Kazuyoshi Wakuta 📂 Article 📅 1996 🏛 Elsevier Science 🌐 English ⚖ 108 KB

For a vector-valued Markov decision process with discounted reward criterion, we introduce a new class of policies called the semi-stationary policies and show that an optimal semi-stationary policy that attains the extreme points of the set of rewards induced by all policies can be described as a c

Optimal stationary policies in the vector-valued Markov decision process

✍ Kazuyoshi Wakuta 📂 Article 📅 1992 🏛 Elsevier Science 🌐 English ⚖ 429 KB

The Convergence of Value Iteration in Discounted Markov Decision Processes

✍ D.J. White; W.T. Scherer 📂 Article 📅 1994 🏛 Elsevier Science 🌐 English ⚖ 435 KB

On the existence of relative values for undiscounted multichain Markov decision processes

✍ Paul J. Schweitzer 📂 Article 📅 1984 🏛 Elsevier Science 🌐 English ⚖ 285 KB

Bounding reward measures of Markov models using the Markov decision processes

✍ Peter Buchholz 📂 Article 📅 2011 🏛 John Wiley and Sons 🌐 English ⚖ 388 KB

For a Markov reward process, where upper and lower bounds for the transition rates and rewards are known, a new approach to bound the expected reward is presented. Based on a previous paper where sharp bounds have been defined for the problem, but only an inefficient and unstable algorit