muti-armed bandits

Dynamic Learning Rate for Deep Reinforcement Learning: A Bandit Approach

In Deep Reinforcement Learning models trained using gradient-based techniques, the choice of optimizer and its learning rate are …

Henrique DONANCIO, Antoine BARRIER, Leah SOUTH, Florence FORBES

17 octobre 2024