Data Mining and Machine Learning Series

Bayesian Reinforcement Learning for Problems with State Uncertainty

8th September 2017, 13:00
Frans Oliehoek
University of Liverpool

Abstract

Sequential decision making under uncertainty is a challenging problem, especially when the decision maker, or agent, has uncertainty about what the true 'state' of the environment is. That is, in many applications the problem is 'partially observable': there are important pieces of information that are fundamentally hidden from the agent. Moreover, the problem gets even more complex when no accurate model of the environment is available. In such cases, the agent will need to update its belief over the environment, i.e., learn, during execution.
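The belief update described above can be made concrete with a small sketch. In a discrete POMDP the agent maintains a probability distribution b(s) over hidden states and, after acting and observing, computes the posterior b'(s') ∝ O(o|s',a) Σ_s T(s'|s,a) b(s). The two-state model, transition matrix, and sensor-noise numbers below are illustrative assumptions, not part of the talk:

```python
import numpy as np

# Hedged sketch of exact Bayesian belief updating in a discrete POMDP.
# The toy model below (two hidden states, one 'listen'-style action,
# an 85%-accurate sensor) is an assumption for illustration only.

n_states = 2

# T[s, s'] : transition probabilities for the single action considered;
# here the action leaves the hidden state unchanged.
T = np.eye(n_states)

# O[s', o] : probability of observing o when the next state is s'.
O = np.array([[0.85, 0.15],
              [0.15, 0.85]])

def belief_update(b, o, T, O):
    """Posterior belief after acting and receiving observation o:
    b'(s') ∝ O(o | s') * sum_s T(s' | s) b(s)."""
    predicted = b @ T             # predict: sum_s T(s'|s) b(s)
    unnorm = O[:, o] * predicted  # correct: weight by observation likelihood
    return unnorm / unnorm.sum() # normalise to a distribution

b = np.array([0.5, 0.5])        # uniform prior over the hidden state
b = belief_update(b, 0, T, O)   # observe o = 0  ->  [0.85, 0.15]
```

Repeating the update with further observations concentrates the belief, which is exactly the sense in which the agent "learns" the hidden state during execution.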

In this talk, I will introduce a formal way of modeling decision making under partial observability, as well as a more recent extension to the learning setting. I will explain how the learning problem can be tackled using a method called 'POMCP', and how this can be made more efficient via a number of novel techniques. Time permitting, I will also discuss extensions of this methodology that explicitly deal with coordination with other agents, and anticipation of other actors (such as humans) in the environment.
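One building block of POMCP (Silver & Veness, 2010) is representing the belief not as an exact distribution but as a set of unweighted state particles, updated by pushing particles through a black-box simulator and keeping those consistent with the real observation. The generative model `G` and its toy dynamics below are assumptions for illustration, not the model used in the talk:

```python
import random

def G(state, action):
    """Assumed black-box generative model: returns (next_state, observation).
    Toy dynamics: the action shifts an integer state; the observation
    equals the state 90% of the time and is off by one otherwise."""
    next_state = state + action
    if random.random() < 0.9:
        obs = next_state
    else:
        obs = next_state + random.choice([-1, 1])
    return next_state, obs

def particle_update(particles, action, real_obs, n_target=100, max_tries=10000):
    """Approximate belief update by rejection sampling: simulate sampled
    particles through G and keep successors whose simulated observation
    matches the observation actually received."""
    new_particles = []
    tries = 0
    while len(new_particles) < n_target and tries < max_tries:
        s = random.choice(particles)       # sample a state from the belief
        s2, o = G(s, action)               # simulate one step
        if o == real_obs:                  # keep only consistent successors
            new_particles.append(s2)
        tries += 1
    return new_particles

belief = particle_update([0] * 10, action=1, real_obs=1)
```

This sidesteps the need for an explicit transition/observation model at planning time: only the ability to sample from the simulator is required, which is what makes the approach attractive in the learning setting.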