USC - Viterbi School of Engineering

NL Seminar - Visual Question Answering: the Good, the Bad, and the Ugly
Fri, Jul 20, 2018 @ 03:00 PM - 04:00 PM
Information Sciences Institute
Conferences, Lectures, & Seminars

Speaker: Wei-Lun Harry Chao, USC

Talk Title: Visual Question Answering: the Good, the Bad, and the Ugly

Series: Natural Language Seminar

Abstract: Visual question answering (Visual QA) requires comprehending and reasoning with both visual and language information, a characteristic ability that AI should strive to achieve. In the past three years alone, over a dozen datasets have been released, together with many learning-based models that have been narrowing the gap between human and machine performance. On one popular dataset, VQA, the state-of-the-art model achieves 71.4 percent accuracy, just shy of human accuracy.

While this progress is seemingly remarkable, it calls for a deeper investigation of what knowledge the machine actually learns: does it understand the multi-modal information, or does it rely on and overfit to incidental dataset statistics? Moreover, current experimental setups mainly focus on training and testing within the same dataset. It is unclear how well a learned model can be applied to real environments, where both the visual and the language data may be mismatched with the training data.

In this talk, I will present our recent studies that address these questions. We show that dataset design has a significant impact on what a model learns. Specifically, the resulting model can ignore the visual information, the question, or both while still doing well on the task. We thus propose automatic procedures to remedy such design deficiencies. We then show that a mismatch in language hinders transferring a learned model across datasets. To address this, we develop a domain adaptation algorithm for Visual QA to facilitate knowledge transfer. Finally, I will present a probabilistic framework for Visual QA algorithms that effectively leverages answer semantics, drastically increasing transferability. I will conclude the talk with future directions to advance Visual QA.

Biography: Wei-Lun Harry Chao is a Computer Science PhD candidate at the University of Southern California, working with Fei Sha. His research interests are in machine learning and its applications to computer vision, artificial intelligence, and health care. His recent work has focused on transfer learning toward vision and language understanding in the wild. His earlier research includes work on probabilistic inference, structured prediction for video summarization, and face understanding. He will be joining The Ohio State University as an assistant professor in Fall 2019, following a one-year postdoc at Cornell University.

Host: Nanyun Peng

More Info: http://nlg.isi.edu/nl-seminar/

Location: 11th Flr Conf Rm # 1135, Marina Del Rey
Audiences: Everyone Is Invited

Contact: Peter Zamar

Event Link: http://nlg.isi.edu/nl-seminar/

This event is open to all eligible individuals. USC Viterbi operates all of its activities consistent with the University's Notice of Non-Discrimination. Eligibility is not determined based on race, sex, ethnicity, sexual orientation, or any other prohibited factor.