USC - Viterbi School of Engineering

AI SEMINAR-Fast large-scale optimization by unifying stochastic gradient and quasi-Newton methods
Fri, Aug 01, 2014 @ 11:00 AM - 12:00 PM
Information Sciences Institute
Conferences, Lectures, & Seminars

Speaker: Jascha Sohl-Dickstein, Khan Academy and Stanford University

Talk Title: Fast large-scale optimization by unifying stochastic gradient and quasi-Newton methods

Series: AISeminar

Abstract:
I will present an algorithm for performing minibatch optimization that combines the computational efficiency of stochastic gradient descent (SGD) with the second-order curvature information leveraged by quasi-Newton methods. These approaches are unified by maintaining an independent Hessian approximation for each minibatch. As in SGD, each update step requires only a single minibatch evaluation. As is typical for quasi-Newton methods, each step is scaled using an approximate inverse Hessian, and little to no adjustment of hyperparameters is required. The algorithm is made tractable in memory and computational cost, even for high-dimensional optimization problems, by storing and manipulating the quadratic approximations for each minibatch in a shared, time-evolving, low-dimensional subspace. Experimental results demonstrate improved convergence on seven diverse optimization problems. The algorithm is released as open source Python and MATLAB packages.
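
The core idea in the abstract (one quadratic model per minibatch, one minibatch evaluation per step, and a step to the minimum of the summed models) can be illustrated with a toy sketch. This is not the released package or its API: the function names (bfgs_update, argmin_of_summed_models, sketch_minimize) are hypothetical, the shared low-dimensional subspace is omitted, every minibatch is evaluated once at the start to initialize its model, and a textbook BFGS update stands in for whatever curvature update the paper uses.

```python
import numpy as np


def bfgs_update(H, s, y, eps=1e-10):
    """Textbook BFGS update of a symmetric Hessian approximation H.

    s is the change in position and y the change in gradient, both measured
    on ONE minibatch. The update is skipped when y.s is not safely positive,
    so H stays positive definite.
    """
    sy = float(y @ s)
    if sy <= eps:
        return H
    Hs = H @ s
    return H - np.outer(Hs, Hs) / float(s @ Hs) + np.outer(y, y) / sy


def argmin_of_summed_models(H_list, x_list, g_list):
    """Minimize sum_i [g_i.(x - x_i) + 0.5 (x - x_i)^T H_i (x - x_i)] in closed form."""
    A = sum(H_list)
    b = sum(H @ x - g for H, x, g in zip(H_list, x_list, g_list))
    return np.linalg.solve(A, b)


def sketch_minimize(grad_fns, x0, num_steps, seed=0):
    """Toy loop: grad_fns[i](x) returns the gradient of minibatch i at x."""
    rng = np.random.default_rng(seed)
    n = len(grad_fns)
    # Simplification (an assumption of this sketch): initialize every
    # minibatch model at x0 with an identity Hessian approximation.
    xs = [x0.copy() for _ in range(n)]
    gs = [g(x0) for g in grad_fns]
    Hs = [np.eye(x0.size) for _ in range(n)]
    x = argmin_of_summed_models(Hs, xs, gs)
    for _ in range(num_steps):
        i = int(rng.integers(n))       # one minibatch per step, as in SGD
        g_new = grad_fns[i](x)         # the only gradient evaluation this step
        Hs[i] = bfgs_update(Hs[i], x - xs[i], g_new - gs[i])
        xs[i], gs[i] = x.copy(), g_new
        x = argmin_of_summed_models(Hs, xs, gs)  # Newton-like step on the sum
    return x
```

Storing a full Hessian approximation per minibatch, as this sketch does, costs memory quadratic in the number of parameters for every minibatch. That cost is what the shared low-dimensional subspace described in the abstract is meant to avoid.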

Optimizer available at:
https://github.com/Sohl-Dickstein/Sum-of-Functions-Optimizer

Paper reference:
Jascha Sohl-Dickstein, Ben Poole, and Surya Ganguli
Fast large-scale optimization by unifying stochastic gradient and quasi-Newton methods
International Conference on Machine Learning (2014)
http://arxiv.org/abs/1311.2115

Biography:
Jascha Sohl-Dickstein is an Academic Resident at the Khan Academy, and a visiting scholar in applied physics in Surya Ganguli's lab at Stanford University. He earned his PhD in 2012 in the Redwood Center for Theoretical Neuroscience at UC Berkeley, in Bruno Olshausen's lab. His research interests involve applying ideas from statistical physics and dynamical systems to problems in machine learning and neuroscience.

Host: Greg Ver Steeg

Webcast: http://webcasterms1.isi.edu/mediasite/Viewer/?peid=925b9f53d4eb4964a37af20bacde2ad31d
Location: Information Sciences Institute (ISI) - 1135
Audiences: Everyone Is Invited

Contact: Alma Nava / Information Sciences Institute

This event is open to all eligible individuals. USC Viterbi operates all of its activities consistent with the University's Notice of Non-Discrimination. Eligibility is not determined based on race, sex, ethnicity, sexual orientation, or any other prohibited factor.