PhD Defense - Bowen Zhang
Mon, May 02, 2022 @ 09:00 AM - 10:30 AM
Thomas Lord Department of Computer Science
University Calendar
PhD Candidate: Bowen Zhang
Committee chairs: Prof. Leana Golubchik (CS dept.), Prof. Fei Sha
Committee members: Prof. Laurent Itti (CS dept.), Prof. Shri Narayanan (EE dept.)
Monday, May 2, 9:00am-10:30am
Title: Visual Representation Learning with Structural Prior
Abstract: Visual representation learning is crucial for building a robust and effective visual understanding system. The goal is to build general-purpose representations that benefit multiple downstream tasks (e.g., image/video classification, segmentation, and retrieval). With access to large-scale datasets and advances in learning methods, sophisticated neural architectures and novel training approaches have been proposed to improve visual representations. However, obtaining a versatile representation is still an open question. This thesis aims to leverage visual structure to obtain more general visual representations. The key observation is that visual data (i.e., images and videos) contains structure: it can be decomposed into atomic components such as objects, attributes, and clips. For example, images can be decomposed into objects, which can be further described by attributes. Similarly, videos can describe complex scenes composed of multiple clips or shots, each of which depicts a semantically coherent event or action. Because atomic components are shareable across modalities and tasks, we hope that a hierarchical visual representation compiled from atomic representations can achieve better generalization. In this thesis, we studied two scenarios for obtaining visual structure: structure derived from parallel visual and text data, and structure derived from the purely visual domain. With the proposed hierarchical visual representation, we achieved state-of-the-art performance on video and text retrieval, moment localization in a video corpus, image and text retrieval, action recognition, and visual storytelling.
WebCast Link: https://usc.zoom.us/j/92058237989
Audiences: Everyone Is Invited
Contact: Lizsl De Leon