Thompson sampling for m-top exploration
We introduce Boundary Focused Thompson Sampling (BFTS), a new Bayesian algorithm for the anytime m-top exploration problem, where the objective is to identify the m best arms in a multi-armed bandit. We consider a set of existing benchmark problems and introduce two new environments inspired by real-world decision problems, for which we experimentally show that BFTS consistently outperforms AT-LUCB, the current state-of-the-art algorithm.
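To make the m-top setting concrete, here is a minimal sketch of a boundary-focused Thompson sampling loop. The abstract does not spell out the algorithm, so several things here are assumptions made for illustration only: Bernoulli rewards with uniform Beta(1, 1) priors, a selection rule that pulls one of the two arms ranked m and m+1 after a posterior draw, and a recommendation based on posterior means. The function names (`bfts_step`, `recommend`, `run`) are hypothetical.

```python
import random


def bfts_step(successes, failures, m, rng):
    """Choose the next arm to pull (assumed boundary rule, requires m < number of arms)."""
    # Draw one sample per arm from its Beta(1 + successes, 1 + failures) posterior.
    samples = [rng.betavariate(1 + s, 1 + f) for s, f in zip(successes, failures)]
    # Rank arms by sampled mean, best first.
    ranked = sorted(range(len(samples)), key=lambda a: samples[a], reverse=True)
    # Assumption: pull, with equal probability, the arm ranked m or the arm ranked m + 1,
    # i.e. the two arms on either side of the top-m boundary.
    return ranked[m - 1] if rng.random() < 0.5 else ranked[m]


def recommend(successes, failures, m):
    """Anytime recommendation: the m arms with the highest posterior mean."""
    means = [(1 + s) / (2 + s + f) for s, f in zip(successes, failures)]
    return sorted(range(len(means)), key=lambda a: means[a], reverse=True)[:m]


def run(true_means, m, budget, seed=0):
    """Simulate the loop on a Bernoulli bandit with the given (hypothetical) arm means."""
    rng = random.Random(seed)
    k = len(true_means)
    successes, failures = [0] * k, [0] * k
    for _ in range(budget):
        arm = bfts_step(successes, failures, m, rng)
        # Simulated Bernoulli reward for the chosen arm.
        if rng.random() < true_means[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return recommend(successes, failures, m)


if __name__ == "__main__":
    # Toy instance: arms 0 and 1 are the two best by a wide margin.
    print(run([0.9, 0.8, 0.2, 0.1], m=2, budget=2000))
```

Because `recommend` can be called after any number of pulls, the sketch reflects the anytime aspect of the problem: there is no fixed budget or confidence level built into the sampling rule itself.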
I am a PhD student at the Department of Computer Science of the Vrije Universiteit Brussel, in Brussels, Belgium. I investigate the use and development of new machine learning techniques to support decision making in the context of epidemic mitigation. My main fields of interest include Bayesian multi-armed bandits, reinforcement learning, epidemiological modelling and computational biology.