Thu, Oct 20, 2022 @ 11:00 AM - 12:00 PM
Information Sciences Institute
Conferences, Lectures, & Seminars
Speaker: Sewon Min, University of Washington
Talk Title: Understanding and Improving Learning through Inference with Large Language Models
Series: NL Seminar
Abstract: THIS TALK WILL NOT BE RECORDED; IT WILL BE BROADCAST LIVE ONLY.
Meeting hosts admit only guests they know to the Zoom meeting. Hence, you are highly encouraged to use your USC account to sign in to Zoom.
If you are an outside visitor, please inform us beforehand at nlg DASH seminar DASH host AT isi DOT edu so that we are aware of your attendance and can let you in.
In-person attendance will be permitted for USC ISI faculty, staff, and students only. Open to the public virtually via the Zoom link and online.
Language models are capable of learning at inference, also referred to as in-context learning: learning a new task by conditioning on k examples and making a prediction for a new input with no parameter updates. While impressive, models suffer from high variance and low worst-case accuracy. Moreover, we do not understand how or why in-context learning works. In the first part of the talk, I will introduce new methods that lead to significant performance gains by reducing variance and improving worst-case accuracy. I will present a new inference method as well as a new training method, the combination of which enables the model to outperform a 230x bigger language model. In the second part of the talk, I will show that in-context learning in fact works very differently from conventional learning: the model does not benefit from correctly paired training data, but rather benefits from the correct specification of the independent distribution of inputs and labels. Finally, I will conclude the talk with lessons learned, limitations, and avenues for future work.
Biography: Sewon Min is a Ph.D. student in the Paul G. Allen School of Computer Science and Engineering at the University of Washington, advised by Prof. Luke Zettlemoyer and Prof. Hannaneh Hajishirzi. She is also a part-time visiting researcher at Meta AI. Her research is in the area of natural language processing and machine learning. Her work specifically focuses on question answering, natural language understanding, knowledge representation, and building general-purpose language understanding models. She is a recipient of the 2022 JP Morgan Ph.D. Fellowship. She has co-organized multiple workshops and tutorials at ACL, EMNLP, NeurIPS, and AKBC, including a workshop on Machine Reading for Question Answering, a competition on Efficient Open-domain Question Answering, a workshop on Representation Learning for NLP, a workshop on Semiparametric Methods in NLP, and a tutorial on Zero- and Few-shot Learning with Pretrained Language Models. Prior to UW, she obtained a B.S. degree in Computer Science & Engineering from Seoul National University.
Host: Jon May and Meryem M'hamdi
More Info: https://nlg.isi.edu/nl-seminar/
WebCast Link: https://usc.zoom.us/j/94452797669
Audiences: Everyone Is Invited
Contact: Pete Zamar
Event Link: https://nlg.isi.edu/nl-seminar/