PhD Defense - Chen-Yu Wei
Fri, Jun 03, 2022 @ 03:00 PM - 05:00 PM
Thomas Lord Department of Computer Science
University Calendar
PhD Candidate: Chen-Yu Wei
Title: Robust and adaptive online decision making
Committee members: Haipeng Luo (host), David Kempe, Rahul Jain, Jiapeng Zhang
Time: 3pm - 5pm, June 3 (Friday)
Zoom link: https://usc.zoom.us/j/96811461450
Abstract:
Online learning (or online decision making) is a learning paradigm that involves real-time interactions between the learner and the environment. The learner has to make real-time decisions based on past data, and the learner's decisions may further affect the data distribution in the future. This is more challenging than the traditional machine learning framework, where the data is i.i.d. and the learner's decisions do not affect the data distribution.
Because the learner's decisions are involved in the data collection process, an important general question is "how to efficiently explore the world in order to learn a good policy?" Past research has developed algorithms that perform strategic exploration and achieve near-optimal performance in the most difficult environments. However, this worst-case view is too pessimistic, since the environment usually has some benign properties that the learner can take advantage of. Thus, a natural question is "how to design algorithms that can take advantage of the easiness of the environment?" We answer this question by developing algorithms whose performance adapts to the easiness of the environment in several canonical online learning settings.
Since online learning is interactive, an adversary in the environment may exploit the learner's algorithm, corrupt the data, and make the learner fail to learn good policies. If an algorithm fails completely under even a small amount of corruption, then it might be too unsafe to deploy in practice. Therefore, we would like to have robust algorithms that can tolerate as much corruption as possible. We achieve this goal by developing algorithms whose performance scales optimally with the amount of corruption.
With adaptivity and robustness, an online learning algorithm can be used more efficiently and more safely in a wide spectrum of environments, without the learner having prior knowledge about the environment. We hope that the algorithmic techniques and insights developed in this thesis will be useful in improving existing algorithms for real applications.
WebCast Link: https://usc.zoom.us/j/96811461450
Audiences: Everyone Is Invited
Contact: Lizsl De Leon