Information-Theoretic Regret Bounds for
Srinivas, N.; Krause, A.; Kakade, S.M.; Seeger, M.
Article
2012
IEEE
English
963 KB
Many applications require optimizing an unknown, noisy function that is expensive to evaluate. We formalize this task as a multiarmed bandit problem, where the payoff function is either sampled from a Gaussian process (GP) or has low norm in a reproducing kernel Hilbert space. We resolve the importa