USC - Viterbi School of Engineering

CS Colloquium: Zhuoran Yang (Princeton University) - Demystifying (Deep) Reinforcement Learning: The Pessimist, The Optimist, and Their Provable Efficiency
Wed, Mar 03, 2021 @ 09:00 AM - 10:00 AM
Thomas Lord Department of Computer Science
Conferences, Lectures, & Seminars

Speaker: Zhuoran Yang, Princeton University

Talk Title: Demystifying (Deep) Reinforcement Learning: The Pessimist, The Optimist, and Their Provable Efficiency

Series: CS Colloquium

Abstract: Coupled with powerful function approximators such as deep neural networks, reinforcement learning (RL) has achieved tremendous empirical success. However, its theoretical understanding lags behind. In particular, it remains unclear how to provably attain the optimal policy with finite regret or sample complexity. In this talk, we present two sides of the same coin, which demonstrate an intriguing duality between pessimism and optimism.

- In the offline setting, we aim to learn the optimal policy based on a dataset collected a priori. Because we cannot actively interact with the environment, we suffer from insufficient coverage of the dataset. To maximally exploit the dataset, we propose a pessimistic least-squares value iteration algorithm, which achieves a minimax-optimal sample complexity.

- In the online setting, we aim to learn the optimal policy by actively interacting with an environment. To strike a balance between exploration and exploitation, we propose an optimistic least-squares value iteration algorithm, which achieves a \sqrt{T} regret in the presence of linear, kernel, and neural function approximators.
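
As a rough illustration of the duality described in the two settings above, the sketch below shows a single regression step of least-squares value iteration with linear features, where an uncertainty bonus is either added (optimism) or subtracted (pessimism). It is a minimal sketch under assumptions, not the speaker's exact algorithm: the function name `lsvi_q_estimate`, the parameters `beta`, `lam`, and `sign`, and the toy data are all hypothetical and chosen for illustration only.

```python
import numpy as np

def lsvi_q_estimate(features, targets, query, beta=1.0, lam=1.0, sign=+1):
    """One ridge-regression step of least-squares value iteration (illustrative).

    features: (n, d) array of feature vectors phi(s_i, a_i) from the data
    targets:  (n,) array of regression targets, e.g. r_i + V_next(s'_i)
    query:    (d,) feature vector phi(s, a) at which to estimate Q
    beta:     bonus scale (hypothetical tuning parameter)
    lam:      ridge regularization strength
    sign:     +1 adds the bonus (optimism), -1 subtracts it (pessimism)
    """
    d = features.shape[1]
    # Regularized Gram matrix of the observed features.
    gram = features.T @ features + lam * np.eye(d)
    # Ridge-regression weights for the value estimate.
    weights = np.linalg.solve(gram, features.T @ targets)
    # Uncertainty bonus: large in directions the data covers poorly.
    bonus = beta * np.sqrt(query @ np.linalg.solve(gram, query))
    return float(query @ weights + sign * bonus)

# Toy data (assumed): three samples, all along the first feature direction.
X = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]])
y = np.ones(3)
seen = np.array([1.0, 0.0])    # direction covered by the data
unseen = np.array([0.0, 1.0])  # direction with no coverage

optimistic_seen = lsvi_q_estimate(X, y, seen, sign=+1)
pessimistic_seen = lsvi_q_estimate(X, y, seen, sign=-1)
optimistic_unseen = lsvi_q_estimate(X, y, unseen, sign=+1)
pessimistic_unseen = lsvi_q_estimate(X, y, unseen, sign=-1)
```

In this toy setup the bonus is larger in the direction the data never visits, so the optimistic estimate is pushed up there (encouraging exploration online) while the pessimistic estimate is pushed down (discouraging reliance on poorly covered actions offline).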

This lecture satisfies requirements for CSCI 591: Research Colloquium.

Biography: Zhuoran Yang is a final-year Ph.D. student in the Department of Operations Research and Financial Engineering at Princeton University, advised by Professor Jianqing Fan and Professor Han Liu. Before attending Princeton, he obtained a Bachelor of Mathematics degree from Tsinghua University. His research interests lie at the interface of machine learning, statistics, and optimization. The primary goal of his research is to design a new generation of machine learning algorithms for large-scale and multi-agent decision-making problems, with both statistical and computational guarantees. He is also interested in applying learning-based decision-making algorithms to real-world problems that arise in robotics, personalized medicine, and computational social science.

Host: Haipeng Luo

Audiences: Everyone Is Invited

Contact: Assistant to CS chair

This event is open to all eligible individuals. USC Viterbi operates all of its activities consistent with the University's Notice of Non-Discrimination. Eligibility is not determined based on race, sex, ethnicity, sexual orientation, or any other prohibited factor.