Fri, Nov 30, 2018 @ 10:00 AM - 12:00 PM
Thomas Lord Department of Computer Science
PhD Defense - Rama Kovvuri
Nov. 30, 2018
Title: Semantic-based visual information retrieval for natural language phrases
Committee: Ramakant Nevatia, Jyotirmoy Deshmukh, Panayiotis Georgiou (external member)
Visual information retrieval is an important task in fields such as visual search, robotics, and autonomous driving. Associating visual entities with natural language is challenging given the diversity and ambiguity of both language and vision.
The primary goal of my research is to predict the semantics of visual entities and associate them with natural language phrases. With the advent of deep learning, computer vision systems can achieve high accuracy in basic tasks such as image classification and detection. I aim to leverage these advances to build higher-level vision-language systems that can associate arbitrary queries with visual entities.
My PhD work focuses on three aspects of this goal: (1) generating diverse object proposals from visual entities; (2) learning rich semantics from these generated proposals; (3) associating natural language phrases with these proposals. In the defense talk, I will introduce my recent work on supervised and weakly supervised phrase grounding. I will also provide a brief overview of the history of phrase grounding and of state-of-the-art methods.
Audiences: Everyone Is Invited
Contact: Lizsl De Leon