BEGIN:VCALENDAR
METHOD:PUBLISH
PRODID:-//Apple Computer\, Inc//iCal 1.0//EN
X-WR-CALNAME;VALUE=TEXT:USC
VERSION:2.0
BEGIN:VEVENT
DESCRIPTION:Speaker: Jascha Sohl-Dickstein, Khan Academy, Stanford University\nTalk Title: Fast large-scale optimization by unifying stochastic gradient and quasi-Newton methods\nSeries: AISeminar\nAbstract:\n I will present an algorithm for performing minibatch optimization that combines the computational efficiency of stochastic gradient descent (SGD) with the second-order curvature information leveraged by quasi-Newton methods. These approaches are unified by maintaining an independent Hessian approximation for each minibatch. Each update step requires only a single minibatch evaluation (as in SGD), each step is scaled using an approximate inverse Hessian, and little to no adjustment of hyperparameters is required (as is typical for quasi-Newton methods). The algorithm is made tractable in memory and computational cost, even for high-dimensional optimization problems, by storing and manipulating the quadratic approximations for each minibatch in a shared, time-evolving, low-dimensional subspace. Experimental results demonstrate improved convergence on seven diverse optimization problems. The algorithm is released as open source Python and MATLAB packages (a minimal usage sketch appears after the calendar entry below).\n \n Optimizer available at:\n https://github.com/Sohl-Dickstein/Sum-of-Functions-Optimizer\n \n Paper reference:\n Jascha Sohl-Dickstein, Ben Poole, and Surya Ganguli\n Fast large-scale optimization by unifying stochastic gradient and quasi-Newton methods\n International Conference on Machine Learning (2014)\n http://arxiv.org/abs/1311.2115\n \n Biography:\n Jascha Sohl-Dickstein is an Academic Resident at the Khan Academy and a visiting scholar in applied physics in Surya Ganguli's lab at Stanford University. He earned his PhD in 2012 in the Redwood Center for Theoretical Neuroscience at UC Berkeley, in Bruno Olshausen's lab. His research interests involve applying ideas from statistical physics and dynamical systems to problems in machine learning and neuroscience.\nHost: Greg Ver Steeg\nWebcast: http://webcasterms1.isi.edu/mediasite/Viewer/?peid=925b9f53d4eb4964a37af20bacde2ad31d
SEQUENCE:5
DTSTART:20140801T110000
LOCATION:ISI 1135
DTSTAMP:20140801T110000
SUMMARY:AI SEMINAR-Fast large-scale optimization by unifying stochastic gradient and quasi-Newton methods
UID:EC9439B1-FF65-11D6-9973-003065F99D04
DTEND:20140801T120000
END:VEVENT
END:VCALENDAR
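For reference, a minimal sketch of how the released Python package might be called, assuming the SFO class, its (f_df, theta_init, subfunction_references) constructor, and the optimize(num_passes=...) method match the README at the repository linked above; the toy least-squares problem and all variable names here are illustrative assumptions, not part of the talk announcement:

import numpy as np
from sfo import SFO  # assumed import path for the package at the repository above

# Objective for one minibatch: SFO expects a function returning (value, gradient).
def f_df(theta, minibatch):
    X, y = minibatch
    resid = X.dot(theta) - y       # toy least-squares residuals
    f = 0.5 * np.sum(resid ** 2)   # subfunction value
    df = X.T.dot(resid)            # gradient with respect to theta
    return f, df

rng = np.random.RandomState(0)
X = rng.randn(1000, 50)
y = X.dot(rng.randn(50)) + 0.1 * rng.randn(1000)
minibatches = [(X[i::10], y[i::10]) for i in range(10)]  # split the data into 10 subfunctions

theta_init = np.zeros(50)
optimizer = SFO(f_df, theta_init, minibatches)  # constructor signature assumed from the README
theta = optimizer.optimize(num_passes=20)       # assumed method; roughly 20 passes through the data

Each call to f_df touches a single minibatch, which corresponds to the abstract's description of one minibatch evaluation per update step.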