University of Southern California

Events Calendar


  • PhD Defense - Bowen Zhang

    Mon, May 02, 2022 @ 09:00 AM - 10:30 AM

    Thomas Lord Department of Computer Science

    University Calendar


    PhD Candidate: Bowen Zhang

    Committee chairs: Prof. Leana Golubchik (CS dept.), Prof. Fei Sha
    Committee members: Prof. Laurent Itti (CS dept.), Prof. Shri Narayanan (EE dept.)

    Monday, May 2, 9:00 AM - 10:30 AM

    Title: Visual Representation Learning with Structural Prior

    Abstract: Visual representation learning is crucial for building a robust and effective visual understanding system. The goal is to build general-purpose representations that benefit multiple downstream tasks (e.g., image/video classification, segmentation, and retrieval). With access to large-scale datasets and advances in learning methods, sophisticated neural architectures and novel training approaches have been proposed to improve visual representations. However, obtaining a versatile representation is still an open question. This thesis aims to leverage visual structure to obtain more general visual representations. The key observation is that visual data (i.e., images and videos) contain structure and can be decomposed into atomic components such as objects, attributes, and clips. For example, images can be decomposed into objects, which can be further described by attributes. Similarly, videos can depict complex scenes composed of multiple clips or shots, each of which shows a semantically coherent event or action. Because atomic components are shareable across modalities and tasks, we hope that a hierarchical visual representation composed from atomic representations can generalize better. In this thesis, we studied two settings for obtaining visual structure: from parallel visual and text data, and from the purely visual domain. With the proposed hierarchical visual representation, we achieved state-of-the-art performance on video and text retrieval, moment localization in a video corpus, image and text retrieval, action recognition, and visual storytelling.

    WebCast Link: https://usc.zoom.us/j/92058237989

    Audiences: Everyone Is Invited

    Contact: Lizsl De Leon
