-
CS Colloq: Random Projection Trees and Low Dimensional Manifolds
Tue, Apr 29, 2008 @ 03:30 PM - 05:00 PM
Thomas Lord Department of Computer Science
Conferences, Lectures, & Seminars
Title: Random Projection Trees and Low Dimensional ManifoldsSpeaker: Prof. Sanjoy Dasgupta (UCSD)Abstract:
The curse of dimensionality has traditionally been the bane of nonparametric statistics (histograms, kernel density estimation, nearest neighbor search, and so on), as reflected in running times and convergence rates that are exponentially bad in the dimension. This problem is all the more pressing as data sets get increasingly high dimensional. Recently the field has been rejuvenated in several ways, of which perhaps the most promising is the realization that a lot of real-world data which appears high-dimensional in fact has low "intrinsic" dimension in the sense of lying close to a low-dimensional manifold. In the past few years, there has been a huge interest in learning such manifolds from data, and then using the learned structure to transform the data into a lower-dimensional space where standard statistical methods generically work better. I'll exhibit a way to benefit from intrinsic low dimensionality without having to go to the trouble of explicitly learning its fine structure. Specifically, I'll present a simple variant of the ubiquitous k-d tree (a spatial data structure widely used in machine learning and statistics) that is provably adaptive to low dimensional structure. We call this a "random projection tree" (RP tree). Along the way, I'll discuss different notions of intrinsic dimension -- motivated by manifolds, by local statistics, and by analysis on metric spaces -- and relate them. I'll then prove that RP trees require resources that depend only on these dimensions rather than the dimension of the space in which the data happens to be situated. This is work with Yoav Freund (UC San Diego).Location: Seaver Science Library (SSL) - 150
Audiences: Everyone Is Invited
Contact: CS Colloquia