Fri, Apr 20, 2018 @ 03:00 PM - 04:00 PM
Information Sciences Institute
Conferences, Lectures, & Seminars
Speaker: Mark Yatskar , AI2
Talk Title: Language as a Scaffold for Visual Recognition
Series: Natural Language Seminar
Abstract: In this talk we propose to use natural language as a guide for what people can perceive about the world from images and what ultimately machines should aim to see as well. We discuss two recent structured prediction efforts in this vein: scene graph parsing in Visual Genome, a framework derived from captions, and visual semantic role labeling in imSitu, a formalism built on FrameNet and WordNet. In scene graph parsing, we examine the problem of modeling higher order repeating structure motifs and present new state of the art baselines and methods. We then look at the problem semantic sparsity in visual semantic role labeling: infrequent combinations of output semantics are frequent. We present new compositional and data-augmentation methods for dealing with this challenge, significantly improving on prior work.
Biography: Mark Yatskar is a post-doc at the Allen Institute for Artificial Intelligence and recipient of their Young Investigator Award. His primary research is in the intersection of language and vision, natural language generation, and ethical computing. He received his Ph.D. from the University of Washington with Luke Zettlemoyer and Ali Farhadi and in 2016 received the EMNLP best paper award and his work has been featured in Wired and the New York Times.
Host: Nanyun Peng
More Info: http://nlg.isi.edu/nl-seminar/
Audiences: Everyone Is Invited
Contact: Peter Zamar