Recent years have seen substantial progress on the algorithms and theory of reinforcement learning (RL). In this talk, I will discuss recent work on provably efficient RL in challenging settings with large state spaces and multiple strategic agents.
I will first discuss on-policy RL for two-player Markov games, which generalize both matrix games and Markov decision processes. To handle continuous and unbounded state spaces, we consider function approximation. Building on a fruitful marriage between RL and algorithmic game theory, we develop an efficient exploration mechanism based on the Coarse Correlated Equilibrium (CCE). The resulting algorithm applies both to the offline setting, where the goal is to find a Nash Equilibrium of the Markov game, and to the online setting, where one plays against an arbitrary opponent and aims to minimize regret. For both settings, we provide sample complexity and regret guarantees.
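To give a flavor of how coarse correlated equilibria arise from learning dynamics, here is a minimal sketch (not the talk's Markov-game algorithm): when both players of a zero-sum matrix game run a no-regret learner such as Hedge (multiplicative weights), the time-averaged play forms an approximate CCE, which in the zero-sum case is an approximate Nash Equilibrium. The payoff matrix, step size, and horizon below are illustrative choices.

```python
import numpy as np

def hedge_selfplay(A, T=5000, eta=0.05):
    """Self-play with Hedge; A is the row player's payoff matrix
    (the column player receives -A). Returns time-averaged strategies."""
    n, m = A.shape
    p = np.ones(n) / n
    p[0] += 0.1                  # small perturbation off the symmetric fixed point
    p /= p.sum()
    q = np.ones(m) / m
    p_sum, q_sum = np.zeros(n), np.zeros(m)
    for _ in range(T):
        p_sum += p
        q_sum += q
        p = p * np.exp(eta * (A @ q))      # row player: weight up high-payoff actions
        p /= p.sum()
        q = q * np.exp(eta * (-A.T @ p))   # column player: payoff is -A
        q /= q.sum()
    return p_sum / T, q_sum / T

# Rock-paper-scissors: the unique equilibrium is uniform play, and the
# averaged strategies approach it even though the iterates themselves cycle.
A = np.array([[ 0., -1.,  1.],
              [ 1.,  0., -1.],
              [-1.,  1.,  0.]])
p_bar, q_bar = hedge_selfplay(A)
```

The averaging step is essential: the instantaneous strategies of no-regret learners can cycle, but their empirical averages have vanishing regret, which is exactly the CCE property.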
I will then discuss RL for mean-field games, which model a very large number of interacting agents. We consider a fictitious play algorithm that alternately updates the policy and the mean-field state, using one step of policy optimization and one step of gradient descent, respectively. We establish an explicit rate of convergence to the Nash equilibrium.
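The alternating structure can be illustrated on a toy one-shot mean-field congestion game (a simplified stand-in for the talk's setting; the cost model and smoothing temperature are assumptions). Each agent picks one of n options; the cost of option a is base_cost[a] plus a congestion term proportional to the fraction mu[a] of the population choosing it. Fictitious play averages the mean field against a smoothed best response.

```python
import numpy as np

def fictitious_play(base_cost, congestion=1.0, temp=1.0, iters=500):
    """Toy mean-field fictitious play: alternate a (softmax-smoothed)
    best-response step for the agents with an averaging update of the
    population distribution mu."""
    n = len(base_cost)
    mu = np.ones(n) / n                        # initial mean field
    for k in range(1, iters + 1):
        cost = base_cost + congestion * mu     # per-option cost under mu
        br = np.exp(-cost / temp)              # smoothed best response
        br /= br.sum()
        mu = (1 - 1 / k) * mu + (1 / k) * br   # fictitious-play average
    return mu

base_cost = np.array([0.0, 0.1, 0.3])
mu_star = fictitious_play(base_cost)
# At equilibrium, the smoothed best response to mu_star is (approximately)
# mu_star itself: no agent benefits from switching options.
```

The 1/k averaging mirrors classical fictitious play; in the talk's setting the smoothed best-response step is replaced by one step of policy optimization and the mean-field update by one step of gradient descent.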
Orchard View Room, Virtual