Economics and Computation Series
Strategically Efficient Exploration for Multi-Agent Reinforcement Learning
1st December 2021, 13:00
Robert Loftin
TU Delft
Abstract
As a basis for exploration, the principle of optimism under uncertainty has led to a number of important theoretical and empirical results in sample-efficient reinforcement learning. In this talk, we discuss the role of optimistic exploration in multi-agent reinforcement learning and address potential issues that arise when applying optimism to RL in zero-sum games. We show that the direct application of optimism can lead to highly inefficient exploration in such games, where "cooperative" exploration focuses on outcomes that are unrealistic in "competitive" play. We then introduce a notion of "strategically efficient" exploration and demonstrate, both theoretically and empirically, that strategically efficient learning algorithms can significantly outperform their optimistic counterparts, while retaining the same worst-case sample complexity guarantees.
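To illustrate the principle the abstract starts from, here is a minimal sketch (not the speaker's algorithm) of optimism under uncertainty in its simplest setting: UCB1 on a two-armed bandit, where each arm's value estimate is inflated by a confidence bonus that shrinks with its visit count. The arm means, seed, and horizon are illustrative assumptions.

```python
import math
import random

def ucb1(pull, n_arms, horizon):
    """Run UCB1 for `horizon` steps; return how often each arm was pulled."""
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            a = t - 1  # try every arm once first
        else:
            # Optimism under uncertainty: act on the empirical mean
            # plus a confidence bonus that decays with the visit count.
            a = max(range(n_arms),
                    key=lambda i: sums[i] / counts[i]
                    + math.sqrt(2 * math.log(t) / counts[i]))
        r = pull(a)
        counts[a] += 1
        sums[a] += r
    return counts

random.seed(0)
means = [0.3, 0.7]  # arm 1 is better; assumed values for the demo
counts = ucb1(lambda a: 1.0 if random.random() < means[a] else 0.0, 2, 2000)
print(counts)  # the better arm is pulled far more often
```

In a zero-sum game, an opponent controls part of the outcome, so an optimistic bonus of this kind can keep steering play toward joint outcomes a rational adversary would never allow, which is the inefficiency the talk examines.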
Additional Materials
Maintained by Nicos Protopapas