A New Class of Policies in Vector-Valued
Kazuyoshi Wakuta
Article
1996
Elsevier Science
English
108 KB
For a vector-valued Markov decision process with discounted reward criterion, we introduce a new class of policies called the semi-stationary policies and show that an optimal semi-stationary policy that attains the extreme points of the set of rewards induced by all policies can be described as a c