USC - Viterbi School of Engineering

TBA
Tue, Mar 20, 2018 @ 11:00 AM - 12:00 PM
Thomas Lord Department of Computer Science
Conferences, Lectures, & Seminars

Speaker: TBA

Talk Title: TBA

Series: CS Colloquium

Abstract: TBA

This lecture satisfies requirements for CSCI 591: Research Colloquium. Please note that, due to limited capacity, seating is first come, first served.

Biography: TBA

Host: Muhammad Naveed / David Kempe

Location: Olin Hall of Engineering (OHE) - 100 D
Audiences: Everyone Is Invited

Contact: Assistant to CS chair

Epstein Institute Seminar, ISE 651
Tue, Mar 20, 2018 @ 03:30 PM - 04:50 PM
Daniel J. Epstein Department of Industrial and Systems Engineering
Conferences, Lectures, & Seminars

Speaker: Dr. Natashia Boland, Professor, Georgia Tech

Talk Title: Time Discretization in Integer Programming

Host: Dr. Phebe Vayanos/Prof. Suvrajeet Sen

More Information: March 20, 2018.pdf
Location: Ethel Percy Andrus Gerontology Center (GER) - 206
Audiences: Everyone Is Invited

Contact: Grace Owh

CS Distinguished Lecture: Sham Kakade (University of Washington) – Sub-Linear Reinforcement Learning
Tue, Mar 20, 2018 @ 04:00 PM - 05:20 PM
Thomas Lord Department of Computer Science
Conferences, Lectures, & Seminars

Speaker: Sham Kakade, University of Washington

Talk Title: Sub-Linear Reinforcement Learning

Series: Computer Science Distinguished Lecture Series

Abstract: Suppose an agent is in an unknown environment and seeks to maximize his/her long-term future reward. We consider the basic question: does the agent need to learn an accurate model of the environment before he/she can start executing a near-optimal long-term course of action?

Specifically, this talk will consider the problem of provably optimal reinforcement learning for (episodic) finite-horizon MDPs, i.e., how an agent learns to maximize his/her (long-term) reward in an uncertain environment. The talk will present a novel algorithm, Variance-reduced Upper Confidence Q-learning (vUCQ), the first algorithm to enjoy a regret bound that is sub-linear in the model size while also achieving optimal minimax regret. The algorithm is sub-linear in that the number of samples needed to achieve epsilon average regret is far smaller than the number required to learn any (non-trivial) estimate of the underlying model of the environment. The importance of sub-linear algorithms is largely the motivation for algorithms such as "Q-learning" and other "model-free" approaches.

vUCQ is a successive refinement method in which the algorithm reduces the variance in the "Q-value" estimates and couples this estimation scheme with an upper-confidence-based algorithm. Technically, the coupling of these two techniques is what leads to the algorithm's strong guarantees, showing that "model-free" approaches can be optimal.

This lecture satisfies requirements for CSCI 591: Research Colloquium.

Biography: Sham Kakade is a Washington Research Foundation Data Science Chair, with a joint appointment in the Department of Statistics and the Department of Computer Science at the University of Washington.

From 2011 to 2015, he was a principal research scientist at Microsoft Research, New England. From 2010 to 2012, he was an associate professor in the Department of Statistics, Wharton, University of Pennsylvania. From 2005 to 2009, he was an assistant professor at the Toyota Technological Institute at Chicago.

He completed his PhD at the Gatsby Computational Neuroscience Unit under the supervision of Peter Dayan, and he was an undergraduate at Caltech, where he obtained his BS in physics. He was a postdoc in the Computer and Information Science department at the University of Pennsylvania under the supervision of Michael Kearns.

Host: Computer Science Department

Location: Henry Salvatori Computer Science Center (SAL) - 101
Audiences: Everyone Is Invited

Contact: Computer Science Department
