University of Southern California

Events Calendar


  • PhD Defense - Shuyang Gao

    Mon, May 07, 2018 @ 12:00 PM - 02:00 PM

    Computer Science

    University Calendar




    Title: Mutual Information Estimation and Its Applications to Machine Learning

    PhD Candidate: Shuyang Gao

    Date: May 7

    Time: 12pm

    Location: SOS B37


    Committee: Aram Galstyan, Greg Ver Steeg, Ilias Diakonikolas, Aiichiro Nakano, Roger Ghanem

    Abstract:
    Mutual information (MI) has been successfully applied to a wide variety of domains due to its remarkable ability to measure dependencies between random variables. Despite its popularity and widespread usage, estimating mutual information remains a common, unavoidable problem. In this thesis, we demonstrate that a popular class of nonparametric MI estimators based on k-nearest-neighbor graphs requires a number of samples that scales exponentially with the true MI. Consequently, accurate estimation of MI between strongly dependent variables is possible only at prohibitively large sample sizes. This important yet overlooked shortcoming of the existing estimators is due to their implicit reliance on local uniformity of the underlying joint distribution. My thesis therefore proposes two new estimation strategies to address this issue. The new estimators are robust to local non-uniformity, work well with limited data, and can capture relationship strengths over many more orders of magnitude than existing k-nearest-neighbor methods.
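    The sample-size issue described above can be illustrated empirically. The sketch below is not from the thesis; it is a minimal NumPy/SciPy implementation of the standard KSG (Kraskov-Stoegbauer-Grassberger) k-nearest-neighbor estimator, applied to bivariate Gaussians where the true MI is known in closed form. At a sample size that suffices for weak dependence, the estimator badly underestimates the MI of near-deterministically correlated variables:

```python
import numpy as np
from scipy.special import digamma

def ksg_mi(x, y, k=3):
    """KSG k-NN estimator (algorithm 1) of I(X;Y) in nats,
    for scalar x and y, using the max-norm in the joint space."""
    n = len(x)
    dx = np.abs(np.subtract.outer(x, x))
    dy = np.abs(np.subtract.outer(y, y))
    dz = np.maximum(dx, dy)              # max-norm distance in joint space
    np.fill_diagonal(dz, np.inf)         # exclude self-distances
    eps = np.sort(dz, axis=1)[:, k - 1]  # distance to the k-th nearest neighbor
    np.fill_diagonal(dx, np.inf)
    np.fill_diagonal(dy, np.inf)
    nx = (dx < eps[:, None]).sum(axis=1)  # marginal neighbors strictly within eps
    ny = (dy < eps[:, None]).sum(axis=1)
    return digamma(k) + digamma(n) - np.mean(digamma(nx + 1) + digamma(ny + 1))

rng = np.random.default_rng(0)
n = 500
for rho in (0.5, 0.9999999):  # weak vs. near-deterministic dependence
    cov = [[1.0, rho], [rho, 1.0]]
    x, y = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
    true_mi = -0.5 * np.log(1.0 - rho ** 2)  # exact MI for a bivariate Gaussian
    print(f"rho={rho}: true MI = {true_mi:.2f} nats, KSG estimate = {ksg_mi(x, y):.2f}")
```

    At rho = 0.9999999 the true MI is about 7.7 nats, but the k-NN estimate saturates far below it, consistent with the exponential sample-size requirement the abstract describes.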

    Modern data mining and machine learning present us with problems that may contain thousands of variables, among which we need to identify only the most promising strong relationships. Caution must therefore be taken when applying mutual information in such real-world scenarios. With these concerns in mind, my thesis then demonstrates the practical applicability of mutual information on several tasks. Our contributions include:
    i) an information-theoretic framework for measuring stylistic coordination in dialogues; the proposed measure has a simple predictive interpretation and can account for various confounding factors through proper conditioning;
    ii) a new algorithm for mutual information-based feature selection in the supervised learning setting;
    iii) an information-theoretic framework for learning disentangled and interpretable representations in the unsupervised setting using deep neural networks.
    For the latter two tasks, we propose a variational lower bound for efficient estimation and optimization of mutual information. For the last task, we also establish a substantial connection between the learning objective and variational auto-encoders (VAEs).
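    One common form of such a variational lower bound (the abstract does not specify which form the thesis uses) is the Barber-Agakov bound I(X;Y) >= H(Y) + E[log q(y|x)], which holds for any variational conditional q and is tight when q equals the true p(y|x). A minimal sketch on a small discrete joint distribution with made-up numbers:

```python
import numpy as np

def entropy(p):
    """Shannon entropy in nats of a probability vector."""
    p = p[p > 0]
    return -np.sum(p * np.log(p))

# a small, made-up discrete joint distribution p(x, y)
pxy = np.array([[0.30, 0.05],
                [0.10, 0.55]])
px = pxy.sum(axis=1)
py = pxy.sum(axis=0)

# exact MI: I(X;Y) = H(X) + H(Y) - H(X,Y)
true_mi = entropy(px) + entropy(py) - entropy(pxy.ravel())

def ba_bound(q_y_given_x):
    """Barber-Agakov bound: H(Y) + E_{p(x,y)}[log q(y|x)] <= I(X;Y)."""
    return entropy(py) + np.sum(pxy * np.log(q_y_given_x))

exact_q = pxy / px[:, None]       # q = p(y|x): the bound is tight
rough_q = np.array([[0.7, 0.3],   # a mismatched variational q: strict lower bound
                    [0.3, 0.7]])
print(true_mi, ba_bound(exact_q), ba_bound(rough_q))
```

    Optimizing the bound over a parametric family for q (e.g. a neural network) is what makes MI estimation tractable in the feature-selection and representation-learning settings.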

    Location: Social Sciences Building (SOS) - B37

    Audiences: Everyone Is Invited

    Contact: Lizsl De Leon

