Research

This page contains information about my current and past research projects.

Publications

See all publications

(2024). Dynamic Learning Rate for Deep Reinforcement Learning: A Bandit Approach. Preprint.

PDF arXiv BibTeX

(2024). MARVEL: MR Fingerprinting with Additional micRoVascular Estimates using bidirectional LSTMs. In MICCAI 2024.

PDF Poster HAL arXiv Code BibTeX

(2022). A Non-Asymptotic Approach to Best-Arm Identification for Gaussian Bandits. In AIStats 2022.

PDF Poster Slides Video PMLR HAL arXiv BibTeX

Postdoc

Since September 2023, I have been a postdoc at the Grenoble Institute of Neuroscience.

I am working on a new MRI acquisition technique called MR Fingerprinting (MRF). This technique makes it possible to estimate several physiological parameters from a single MRI acquisition, whereas conventional examinations require one acquisition for each parameter.

The MRF method could thus make it possible to considerably reduce the time required for MRI examinations, which in turn would increase the rate of examinations and make MRI more useful in emergency situations such as stroke.

Thesis

My thesis was carried out from September 2020 to August 2023. I defended it on Thursday, July 20, 2023 at ÉNS Lyon. The manuscript is available here and the slides here.

The context of bandit problems is the following: consider K distinct probability distributions ν₁, …, ν_K. These distributions are unknown, but at each step you can select an arm 1 ≤ k ≤ K and observe an independent realization of ν_k. You may use any strategy you like (that is to say, choose the next arm to observe using all the previous observations).
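The setup above can be sketched as a toy environment. The arm means below are illustrative values, not from any real experiment; a strategy interacts with the environment only through `pull`.

```python
import random

# A toy bandit environment: K Gaussian arms whose means are hidden from the
# strategy. At each step a strategy selects an arm k and observes one
# independent sample from nu_k. (Means here are made up for illustration.)
class GaussianBandit:
    def __init__(self, means, sigma=1.0, seed=0):
        self.means = means          # hidden expectations mu_1, ..., mu_K
        self.sigma = sigma          # common standard deviation
        self.rng = random.Random(seed)

    def pull(self, k):
        """Return an independent realization of arm k (0-indexed)."""
        return self.rng.gauss(self.means[k], self.sigma)

bandit = GaussianBandit(means=[0.2, 0.5, 0.9])
sample = bandit.pull(2)  # one draw from the arm with the highest mean
```

A strategy is then just a rule mapping the history of (arm, observation) pairs to the next arm to pull.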

There are several mathematical objectives. For instance, in Best Arm Identification, the goal is to identify the best arm, which is the arm with highest associated expectation. There are two settings:

  • in the Fixed Confidence setting, you are given a confidence level δ ∈ (0, 1) and you need to find a strategy that identifies the best arm with probability at least 1 − δ. The objective is then to minimize the expected number of observations required by the strategy.
  • in the Fixed Budget setting, you are given a fixed number of observations n ∈ ℕ and you have to find a strategy that maximizes the probability of returning the best arm after those observations.

For more information about bandit problems, the book by Tor Lattimore and Csaba Szepesvári is a good introduction.

Research Internships

HPC resource management improvement using Reinforcement Learning
I used Reinforcement Learning to tackle the problem of resource allocation in HPC clusters during a 4-month internship
Random Hyperbolic Graphs
I studied propagation models in Random Hyperbolic Graphs during a 4-month internship
Bundle Adjustment with Known Positions
I studied Bundle Adjustment properties of satellite images during a 3-month internship