PhD Thesis Proposal - Zhaoheng Zheng
Wed, Nov 30, 2022 @ 08:30 AM - 10:00 AM
Thomas Lord Department of Computer Science
Ph.D. Candidate: Zhaoheng Zheng
Topic: Incorporating Large-Scale Vision-Language Corpora in Visual Understanding
Committee Chair: Prof. Ram Nevatia
Committee Member: Prof. Keith Jenkins
Committee Member: Prof. Jesse Thomason
Committee Member: Prof. Greg Ver Steeg
Committee Member: Prof. Mohammad Soleymani
Abstract: Vision and language are key mediators through which humans interact with the external world and with other members of society. One goal of artificial intelligence (AI) research is to create machines that can perceive the real world through multiple modalities. Previous research has made remarkable progress in building functional visual and linguistic perception systems with the help of deep neural networks. Recently, thanks to the growth of the Internet and social media, large-scale vision-language corpora have become easily accessible, motivating research aimed at creating large-scale Vision-Language Pre-training (VLP) models. Compared with previous methods, VLP models are stronger and more generalizable owing to the scale of their training data. In this thesis, we investigate how to leverage such data to improve existing visual understanding tasks. In particular, in FashionVLP, we propose to fine-tune a pre-trained VLP model for fashion image retrieval. More specifically, we fine-tune the model with customized input sequences containing various vision and language features, achieving significant improvements on multiple benchmarks. Moreover, we take a step further and explore better designs for VLP models to learn from large-scale corpora, resulting in our recent work, the Fractional Intermediate Tower (FIT). FIT enhances the vision-language fusion process inside VLP models by encoding vision features from multiple vision layers before they are passed to the fusion encoder.
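To give a rough sense of the multi-layer fusion idea mentioned in the abstract, the minimal PyTorch sketch below aggregates vision features from several intermediate vision-encoder layers before a text-to-vision cross-attention step. All module names, dimensions, and the averaging scheme are hypothetical placeholders for illustration only; they are not the actual FIT or FashionVLP architecture.

```python
# Illustrative sketch only: fuse text tokens with vision features aggregated
# from multiple intermediate vision layers (hypothetical design, not FIT itself).
import torch
import torch.nn as nn

class MultiLayerVisionAggregator(nn.Module):
    """Projects hidden states from several vision layers and averages them."""
    def __init__(self, dim: int, num_layers: int):
        super().__init__()
        self.proj = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_layers))

    def forward(self, layer_feats):  # list of (B, N, dim) tensors
        projected = [p(f) for p, f in zip(self.proj, layer_feats)]
        return torch.stack(projected, dim=0).mean(dim=0)  # (B, N, dim)

class FusionBlock(nn.Module):
    """Text tokens cross-attend to the aggregated vision tokens."""
    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, text_feats, vision_feats):
        attended, _ = self.cross_attn(text_feats, vision_feats, vision_feats)
        return self.norm(text_feats + attended)

# Toy usage: three intermediate vision layers, one fusion step.
B, N_img, N_txt, D = 2, 49, 16, 256
vision_layers = [torch.randn(B, N_img, D) for _ in range(3)]
text_tokens = torch.randn(B, N_txt, D)

aggregator = MultiLayerVisionAggregator(D, num_layers=3)
fusion = FusionBlock(D)
fused = fusion(text_tokens, aggregator(vision_layers))
print(fused.shape)  # torch.Size([2, 16, 256])
```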
WebCast Link: https://usc.zoom.us/j/95655803815?pwd=d3RrOXNrU2dVVE1sTkZpYXU3NWxEUT09
Audiences: Everyone Is Invited
Contact: Lizsl De Leon