A bandit process with delayed responses
Xikui Wang
Article
2000
Elsevier Science
English
73 KB
Structural properties on the boundary and monotonicity of optimal strategies in a two-armed bandit process with delayed responses are explored. Moreover, a finite horizon optimal stopping solution is derived which complements a known result in the infinite horizon case.