Wed, Aug 31, 2022 @ 03:00 PM - 04:30 PM
Thomas Lord Department of Computer Science
PhD Candidate: Chung-Wei Lee
Title: Online Learning and Its Applications to Games and Partially Observable Systems
Committee: Haipeng Luo (host), David Kempe, Ashutosh Nayyar, Vatsal Sharan, Jiapeng Zhang
Abstract: Online Learning is a general framework for studying sequential decision-making. I will start with its applications in solving games. In particular, Online Learning has been shown to be an essential theoretical foundation for building superhuman AI in poker games. We first focus on last-iterate convergence, a favorable property for online learning algorithms in two-player zero-sum games. In normal-form games, we show that optimistic multiplicative weight updates (OMWU) and optimistic gradient descent ascent (OGDA) enjoy last-iterate convergence. We then generalize the results to extensive-form games (EFGs), which model the sequential actions and incomplete information that appear in card games. We show that a family of regret minimization algorithms has last-iterate convergence, and that some of them, based on OMWU and OGDA, even converge exponentially fast. We then consider multiplayer games, where our goal becomes minimizing the individual regret of every player. We design two algorithms that achieve logarithmic regret in EFGs, based on ideas including a reduction from normal-form games and the use of a self-concordant regularizer on a lifted space.
In addition to solving EFGs as an application of Online Learning to partially observable systems, we discuss other examples, including dynamic pricing and recommender systems. Specifically, we formulate these problems as bandits with graph feedback and preference elicitation, and discuss our contributions therein. Finally, I will talk about future work in all of these directions.
Audiences: Everyone Is Invited
Contact: Lizsl De Leon