University of Southern California

Events Calendar

  • PhD Thesis Proposal - Zhaoheng Zheng

    Wed, Nov 30, 2022 @ 08:30 AM - 10:00 AM

    Thomas Lord Department of Computer Science

    University Calendar

    Ph.D. Candidate: Zhaoheng Zheng

    Topic: Incorporating Large-Scale Vision-Language Corpora in Visual Understanding

    Committee Chair: Prof. Ram Nevatia
    Committee Member: Prof. Keith Jenkins
    Committee Member: Prof. Jesse Thomason
    Committee Member: Prof. Greg Ver Steeg
    Committee Member: Prof. Mohammad Soleymani

    Abstract: Vision and language are key mediators through which humans interact with the external world and with other members of society. One goal of artificial intelligence (AI) research is to create machines that can perceive the real world through multiple modalities. Previous research has shown remarkable progress in building functional visual or linguistic perception systems with the help of deep neural networks. Recently, thanks to the growth of the Internet and social media, large-scale vision-language corpora have become easily accessible, motivating research on large-scale Vision-Language Pre-training (VLP) models. Compared with previous methods, VLP models are stronger and more generalizable thanks to the scale of their training data. In this thesis, we investigate how to leverage such data to improve performance on existing visual understanding tasks. In particular, in FashionVLP, we fine-tune a pre-trained VLP model for fashion image retrieval, using customized input sequences that contain various vision-language features, and achieve significant improvements on multiple benchmarks. Moreover, we go a step further and explore better designs for VLP models to learn from large-scale corpora, resulting in our recent work, Fractional Intermediate Tower (FIT). FIT enhances the vision-language fusion process inside VLP models by encoding vision features from multiple vision layers before they are passed to the fusion encoder.

    WebCast Link: https://usc.zoom.us/j/95655803815?pwd=d3RrOXNrU2dVVE1sTkZpYXU3NWxEUT09

    Audiences: Everyone Is Invited

    Contact: Lizsl De Leon

