BEGIN:VCALENDAR
METHOD:PUBLISH
PRODID:-//Apple Computer\, Inc//iCal 1.0//EN
X-WR-CALNAME;VALUE=TEXT:USC
VERSION:2.0
BEGIN:VEVENT
DESCRIPTION:Time: 2:00-3:00pm Feb 25, Friday\n
\n
Committee: Haipeng Luo (chair), Rahul Jain, David Kempe, Ashutosh Nayyar, Vatsal Sharan.\n
\n
Title: Online Goal-Oriented Reinforcement Learning\n
\n
Abstract: Reinforcement Learning (RL) studies how an agent learns to behave optimally in an unknown environment. It has been a popular topic in both industry and academia since AlphaGo demonstrated its great potential. However, there is still a large gap between the theory and practice of RL due to the strong assumptions made in theoretical RL. My research focuses on online learning in a goal-oriented Markov Decision Process model named Stochastic Shortest Path (SSP), where the learner's objective is to reach a goal state with the smallest possible cost. Many real applications can be modeled by SSP, such as games, car navigation, and robotic manipulation. To understand the SSP model better, we first focus on establishing minimax regret bounds in various settings. Specifically, for SSP with stochastic costs, we develop a simple minimax optimal algorithm concurrently with other works; for SSP with adversarial costs, we develop efficient minimax optimal algorithms when the transition function is known, and near-optimal algorithms when it is unknown. Next, we focus on developing practical learning algorithms for SSP from different perspectives. Specifically, we develop the first model-free algorithm, the first set of policy optimization algorithms, and improved algorithms with linear function approximation.\n
\n
For future work, I plan to study SSP in more general settings and develop more practical algorithms. For example, I plan to study non-stationary SSP, where both the transition and cost functions change over time, and SSP under general function approximation. I also plan to develop parameter-free SSP algorithms under different settings.
SEQUENCE:5
DTSTART:20220225T140000
LOCATION:
DTSTAMP:20220225T140000
SUMMARY:PhD Thesis Proposal - Liyu Chen
UID:EC9439B1-FF65-11D6-9973-003065F99D04
DTEND:20220225T150000
END:VEVENT
END:VCALENDAR