CS Colloquium: Lihong Li (Microsoft Research) - Taming the Monster: Provably Efficient Algorithms for Contextual Bandits with General Policy Classes
Thu, Oct 22, 2015 @ 04:00 PM - 05:00 PM
Thomas Lord Department of Computer Science
Conferences, Lectures, & Seminars
Speaker: Lihong Li, Microsoft Research
Talk Title: Taming the Monster: Provably Efficient Algorithms for Contextual Bandits with General Policy Classes
Series: CS Colloquium
Abstract: This lecture satisfies requirements for CSCI 591: Computer Science Research Colloquium
We consider contextual bandit problems, in which the learner, in each round, takes one of K actions in response to the observed context and observes the reward only for the chosen action. In the first part of the talk, we focus on the standard setting, where the challenge is to balance exploration and exploitation efficiently so as to maximize total reward (equivalently, minimize total regret) over T rounds, a problem that arises in many interactive applications such as advertising and recommendation. Our algorithm assumes access to an oracle for solving a form of classification problem and achieves the statistically optimal regret guarantee with a small number of oracle calls across T rounds. The resulting algorithm is the most practical among contextual-bandit algorithms that work for general policy classes. In the second part of the talk, we show how this general algorithmic idea can be adapted to contextual bandits with global convex constraints and concave objective functions, a setting that is substantially harder and important in many applications. Joint work with Alekh Agarwal, Shipra Agrawal, Nikhil R. Devanur, Daniel Hsu, Satyen Kale, John Langford, and Robert E. Schapire.
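For readers new to the setting, the interaction protocol described in the abstract can be illustrated with a minimal sketch. This is not the algorithm presented in the talk: it substitutes a simple epsilon-greedy rule for the oracle-based method, and the function name `run_epsilon_greedy`, the `mean_reward` table, and all parameter values are hypothetical choices made only for illustration.

```python
import random


def run_epsilon_greedy(mean_reward, num_rounds, epsilon=0.1, seed=0):
    """Simulate the bandit-feedback protocol with an epsilon-greedy learner.

    mean_reward[context][action] is a hypothetical environment: the
    probability that the action yields reward 1 in that context.
    Returns (total_reward, pseudo_regret).
    """
    rng = random.Random(seed)
    num_contexts = len(mean_reward)
    num_actions = len(mean_reward[0])
    # Per-(context, action) pull counts and running mean reward estimates.
    counts = [[0] * num_actions for _ in range(num_contexts)]
    estimates = [[0.0] * num_actions for _ in range(num_contexts)]
    total_reward = 0.0
    pseudo_regret = 0.0

    for _ in range(num_rounds):
        # 1. The learner observes a context.
        context = rng.randrange(num_contexts)
        # 2. It picks one of K actions: explore with probability epsilon,
        #    otherwise exploit the current estimates.
        if rng.random() < epsilon:
            action = rng.randrange(num_actions)
        else:
            action = max(range(num_actions),
                         key=lambda a: estimates[context][a])
        # 3. Only the reward of the chosen action is revealed.
        reward = 1.0 if rng.random() < mean_reward[context][action] else 0.0
        counts[context][action] += 1
        n = counts[context][action]
        estimates[context][action] += (reward - estimates[context][action]) / n
        total_reward += reward
        # Regret is measured against the best action for this context.
        pseudo_regret += max(mean_reward[context]) - mean_reward[context][action]

    return total_reward, pseudo_regret


if __name__ == "__main__":
    means = [[0.9, 0.2, 0.1], [0.1, 0.2, 0.8]]  # 2 contexts, K = 3 actions
    total, regret = run_epsilon_greedy(means, num_rounds=2000)
    print("total reward:", total, "pseudo-regret:", regret)
```

The sketch keeps a separate estimate for every context, which is only feasible when contexts are few; handling rich contexts through a general policy class is the harder problem the talk addresses.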
This lecture will be available to stream via the webcast link listed below.
Biography: Lihong Li is a Researcher in the Machine Learning Department at Microsoft Research-Redmond. Prior to joining Microsoft, he was a Research Scientist in the Machine Learning Group at Yahoo! Research in Silicon Valley. He obtained a PhD in Computer Science from Rutgers University. His main research interests are in machine learning with interaction, including reinforcement learning, multi-armed bandits, online learning, and their applications, especially those on the Internet such as recommender systems, search, and advertising. He has served as area chair or senior program committee member at ICML, NIPS, and IJCAI.
Host: Yan Liu
Webcast: https://bluejeans.com/350859861
Location: Henry Salvatori Computer Science Center (SAL) - 101
Audiences: Everyone Is Invited
Contact: Assistant to CS chair