University of Southern California

Events Calendar


  • PhD Dissertation Defense - Arka Sadhu

    Tue, Apr 23, 2024 @ 02:00 PM - 03:30 PM

    Thomas Lord Department of Computer Science

    University Calendar


    Title: Grounding Language in Images and Videos  
     
    Location: SAL 213  
     
    Time: 2 pm on April 23, 2024  
     
    Committee Members: Ram Nevatia (Chair), Xiang Ren, Toby Mintz  
     
    Abstract: My thesis investigates the problem of grounding language in images and videos -- the task of associating linguistic symbols with perceptual experiences and actions -- which is fundamental to developing multi-modal models that can understand and jointly reason over images, videos, and text. The overarching goal of my dissertation is to bridge the gap between language and vision as a means to a "deeper understanding" of images and videos, enabling models capable of reasoning over longer time horizons such as hour-long movies, collections of images, or even multiple videos. In this thesis, I will introduce the vision-language tasks developed during my Ph.D.: grounding unseen words, spatiotemporal localization of entities in a video, video question answering, visual semantic role labeling in videos, reasoning across more than one image or video, and finally, weakly-supervised open-vocabulary object detection. Each task aims to investigate, in isolation, a particular phenomenon inherent in image or video understanding. For each task, I will discuss the development of the corresponding datasets, model frameworks, and evaluation protocols robust to data priors.
     
    The resulting models can be used for other downstream tasks, such as obtaining common-sense knowledge graphs from instructional videos, or to drive end-user applications such as retrieval, question answering, and captioning.
     
    Zoom Link: https://usc.zoom.us/j/94652316277?pwd=QTdqcklJMjg2UE03ZVZHbmFvWU9nQT09    

    Location: Henry Salvatori Computer Science Center (SAL) - 213

    Audiences: Everyone Is Invited

    Contact: Arka Sadhu
