BEGIN:VCALENDAR METHOD:PUBLISH PRODID:-//Apple Computer\, Inc//iCal 1.0//EN X-WR-CALNAME;VALUE=TEXT:USC VERSION:2.0 BEGIN:VEVENT DESCRIPTION:Speaker: Sham Kakade, University of Washington Talk Title: Sub-Linear Reinforcement Learning Series: Computer Science Distinguished Lecture Series Abstract: Suppose an agent is in an unknown environment and seeks to maximize his/her long-term future reward. We consider the basic question: does the agent need to learn an accurate model of the environment before he/she can start executing a near-optimal long-term course of actions? \n \n Specifically, this talk will consider the problem of provably optimal reinforcement learning for (episodic) finite-horizon MDPs, i.e., how an agent learns to maximize his/her (long-term) reward in an uncertain environment. The talk will present a novel algorithm, the Variance-reduced Upper Confidence Q-learning (vUCQ), which is the first algorithm to enjoy a regret bound that is sub-linear in the model size while also achieving optimal minimax regret. The algorithm is sub-linear in that the number of samples needed to achieve epsilon average regret is far less than that required to learn any (non-trivial) estimate of the underlying model of the environment. The importance of sub-linear algorithms is largely the motivation for algorithms such as "Q-learning" and other "model-free" approaches. \n \n vUCQ is a successive refinement method in which the algorithm reduces the variance in the "Q-value" estimates and couples this estimation scheme with an upper-confidence-based algorithm. Technically, the coupling of these techniques is what leads to the algorithm's strong guarantees, showing that "model-free" approaches can be optimal.\n \n This lecture satisfies requirements for CSCI 591: Research Colloquium. 
\n Biography: Sham Kakade is a Washington Research Foundation Data Science Chair, with a joint appointment in the Department of Statistics and the Department of Computer Science at the University of Washington. \n \n From 2011 to 2015, he was a principal research scientist at Microsoft Research, New England. From 2010 to 2012, he was an associate professor in the Department of Statistics, Wharton, University of Pennsylvania. From 2005 to 2009, he was an assistant professor at the Toyota Technological Institute at Chicago. \n \n He completed his PhD at the Gatsby Computational Neuroscience Unit under the supervision of Peter Dayan, and he was an undergraduate at Caltech, where he obtained his BS in physics. He was a postdoc in the Computer and Information Science department at the University of Pennsylvania under the supervision of Michael Kearns.\n Host: Computer Science Department SEQUENCE:5 DTSTART:20180320T160000 LOCATION:SAL 101 DTSTAMP:20180320T160000 SUMMARY:CS Distinguished Lecture: Sham Kakade (University of Washington) – Sub-Linear Reinforcement Learning UID:EC9439B1-FF65-11D6-9973-003065F99D04 DTEND:20180320T172000 END:VEVENT END:VCALENDAR