Contributions to a Theory of Pure Exploration in Sequential Statistics

This thesis lies in the fields of artificial intelligence, sequential statistics and optimization. We focus on the problem of best (in expectation) arm identification in unstructured muti-armed bandits. This problem has two approaches with very different levels of understanding.

The fixed-confidence framework is the best understood: asymptotically optimal strategies are known, and we are interested in obtaining non-asymptotic guarantees for (if possible) simple and natural strategies. Working with Gaussian bandits, we propose a finite risk analysis of a new (asymptotically optimal) strategy using the regularity properties of this model. This strategy slightly modifies the sampling rule of the Track-and-Stop algorithm into a more conservative and interpretable rule. In the more general context of an exponential model, we propose a preliminary analysis of the asymptotic optimality of adaptive Top-Two algorithms, whose sampling rules are particularly simple.

Independently, in the fixed-budget framework, for which the existence of a hypothetical complexity remains to be demonstrated, we propose generalizations to non-parametric models of the existing bounds (upper and lower) that were available so far only for very specific models. The obtained bounds involve more precise information-theoretic quantities than the gaps (differences between the means) which appeared previously. These new quantities could be the key to measuring the complexity of fixed-budget best-arm identification.

Antoine BARRIER
Antoine BARRIER
PostDoc in Medical Imaging

I’m interested in Medical Imaging techniques and in Optimization Algorithms in Sequential Learning.