A Non-Asymptotic Approach to Best-Arm Identification for Gaussian Bandits

Antoine BARRIER, Aurélien GARIVIER, Tomáš KOCÁK

1 April 2022

We propose a new strategy for fixed-confidence best-arm identification for Gaussian variables with bounded means and unit variance. This strategy, called Exploration-Biased Sampling, is not only asymptotically optimal: it is, to the best of our knowledge, the first strategy with non-asymptotic bounds that asymptotically matches the sample complexity. Its main advantage over other algorithms such as Track-and-Stop, however, is improved exploration behavior: Exploration-Biased Sampling is biased towards exploration in a subtle but natural way that makes it more stable and interpretable. These improvements are made possible by a new analysis of the sample complexity optimization problem, which yields a faster numerical resolution scheme and several quantitative regularity results that we believe to be of high independent interest.

Summary. We propose a new strategy for best-arm identification with fixed confidence and show non-asymptotic bounds for Gaussian variables with bounded means and unit variance.

Antoine BARRIER

PostDoc in Medical Imaging

I’m interested in Medical Imaging techniques and in Optimization Algorithms in Sequential Learning.

Antoine BARRIER

PostDoc in Medical Imaging

Aurélien GARIVIER

Full Professor

Tomáš KOCÁK

Research Assistant