Thu, Oct 10, 2019 @ 12:15 PM - 02:00 PM
Thomas Lord Department of Computer Science
Conferences, Lectures, & Seminars
Speaker: Grigory Yaroslavtsev, Assistant Professor of Statistics at Indiana University
Talk Title: Advances in Hierarchical Clustering of Vector Data
Abstract: Hierarchical clustering plays an important role in data analysis and has many applications, yet compared to the highly successful flat clustering (e.g. k-means), it lacked rigorous algorithmic study until recently because no rigorous objective had been defined. This changed with a breakthrough by Dasgupta, who introduced a formal objective into the study of hierarchical clustering. Since 2016, a sequence of works has emerged giving novel algorithms for this problem in the general metric setting.
In this talk, I will give an overview of our recent progress on models and scalable algorithms for hierarchical clustering applicable specifically to high-dimensional vector data, including embedding vectors arising from deep learning. I will first discuss several linkage-based algorithms (single-linkage, average-linkage) and their formal properties with respect to different objectives. I will then introduce a new projection-based approximation algorithm for vector data. The talk will be self-contained and will not assume prior knowledge of clustering methods.
Host: Shaddin Dughmi
Audiences: Everyone Is Invited
Contact: Cherie Carter