Conferences, Lectures, & Seminars
Events for April
NL Seminar - Drinking From The Firehose of Science.
Thu, Apr 13, 2023 @ 11:00 AM - 12:00 PM
Information Sciences Institute
Conferences, Lectures, & Seminars
Speaker: Waleed Ammar, Allen Inst of AI (AI2)
Talk Title: Drinking From The Firehose of Science.
Series: NL Seminar
REMINDER: Meeting hosts admit only guests they know to the Zoom meeting, so you are strongly encouraged to sign in to Zoom with your USC account. If you are an outside visitor, please let us know at nlg DASH seminar DASH host AT isi DOT edu beforehand so we can expect you and admit you.
Abstract: Five years ago, I visited ISI to talk about our progress taming the scientific literature with the Semantic Scholar team at the Allen Institute for Artificial Intelligence. In this talk, I will highlight some of the exciting developments in Semantic Scholar over the past few years, then share how we've enabled a wide variety of users and partners to "drink" from the firehose of scientific publications by interfacing with the Semantic Scholar APIs. I will end with an interactive discussion of how we can increase the participation of underrepresented groups in science.
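The Semantic Scholar APIs mentioned in the abstract are publicly documented. As a rough illustration only (the endpoint and field names follow the public Graph API docs; the sample response and paper title are stand-ins, not data from the talk), a paper search might look like:

```python
# Sketch of querying the Semantic Scholar Graph API (public endpoint;
# field names per the api.semanticscholar.org docs). The network call is
# illustrative only -- URL building and response parsing are separated
# so they can be exercised offline.
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://api.semanticscholar.org/graph/v1/paper/search"

def build_search_url(query: str,
                     fields=("title", "year", "citationCount"),
                     limit: int = 5) -> str:
    """Assemble a paper-search URL for the Graph API."""
    params = urlencode({"query": query,
                        "fields": ",".join(fields),
                        "limit": limit})
    return f"{SEARCH_ENDPOINT}?{params}"

def titles_from_response(payload: dict) -> list[str]:
    """Pull paper titles out of a Graph API search response."""
    return [paper.get("title", "") for paper in payload.get("data", [])]

# Abridged example of the response shape described in the API docs:
sample = {"total": 1,
          "data": [{"paperId": "abc123",
                    "title": "Construction of the Literature Graph in Semantic Scholar",
                    "year": 2018}]}

print(build_search_url("scientific literature knowledge graph"))
print(titles_from_response(sample))
```

Fetching `build_search_url(...)` with any HTTP client and feeding the JSON body to `titles_from_response` would list matching paper titles.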
Biography: Waleed Ammar currently leads the Semantic Scholar APIs effort at the Allen Institute for Artificial Intelligence (AI2), which enables researchers, practitioners, and decision makers to run a wide variety of computations on the scientific literature across many research fields. Before rejoining AI2 this year, Waleed was a senior research scientist at Google, where he helped develop transformer-based models for generating DNA sequences from PacBio long reads, significantly reducing variant-calling errors [Nature Biotech'22]. He also helped develop task-oriented dialog systems that are more robust to disfluencies, code-switching, and user revisions [arXiv'23]. Prior to joining Google, Waleed led the Semantic Scholar research team's efforts to develop ML-based methods to facilitate access to the literature [e.g., NAACL'19], build a knowledge graph of the scientific literature [NAACL'18], and use this wealth of information to identify systemic social problems in science [JAMA'19]. He also occasionally teaches courses in UW Linguistics as an affiliate faculty member. In 2016, Waleed received a Ph.D. in artificial intelligence from Carnegie Mellon University. Before pursuing the Ph.D., Waleed was a research engineer at Microsoft Research and a web developer at eSpace Technologies. Outside work, Waleed spends most of his time on the water or in dance studios.
Host: Jon May and Justin Cho
More Info: https://nlg.isi.edu/nl-seminar/
Webcast: https://www.youtube.com/watch?v=SsJNCkPEDu8
Location: Information Sciences Institute (ISI) - Virtual and ISI-Conf Rm#689
WebCast Link: https://www.youtube.com/watch?v=SsJNCkPEDu8
Audiences: Everyone Is Invited
Contact: Pete Zamar
Event Link: https://nlg.isi.edu/nl-seminar/
This event is open to all eligible individuals. USC Viterbi operates all of its activities consistent with the University's Notice of Non-Discrimination. Eligibility is not determined based on race, sex, ethnicity, sexual orientation, or any other prohibited factor.
NL Seminar - Modular Language Models
Thu, Apr 20, 2023 @ 11:00 AM - 12:00 PM
Information Sciences Institute
Conferences, Lectures, & Seminars
Speaker: Suchin Gururangan, University of Washington
Talk Title: Modular Language Models
Series: NL Seminar
REMINDER: Meeting hosts admit only guests they know to the Zoom meeting, so you are strongly encouraged to sign in to Zoom with your USC account. If you are an outside visitor, please let us know at nlg DASH seminar DASH host AT isi DOT edu beforehand so we can expect you and admit you.
Abstract: Conventional language models (LMs) are trained densely: all parameters are updated with respect to all data. We argue that dense training leads to a variety of well-documented issues with LMs, including their prohibitive training cost and unreliable downstream behavior. We then introduce a new class of LMs that are fundamentally modular, where components (or experts) of the LM are specialized to distinct domains in the training corpus, and experts are conditionally updated based on the domain of the incoming document. We show how modularity addresses the limitations of dense training by enabling LMs that are rapidly customizable (with the ability to mix, add, or remove experts after training), embarrassingly parallel (requiring no communication between experts), and sparse (needing only a few experts active at a time for inference). Key to our proposal is exploring what constitutes the domains to which experts specialize, as well as reflecting on the data sources used to train LMs. Our new techniques chart a path towards collaborative LM development, where anyone can contribute and maintain experts at very modest computational cost.
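The conditional-update scheme the abstract describes, route each incoming document to a domain expert and update only that expert, can be sketched in a few lines. The keyword-based router and the "experts" below are toy stand-ins for illustration, not the talk's actual method:

```python
# Toy sketch of domain-conditional expert training: each document updates
# only the expert matched to its domain, so experts train independently
# (no cross-expert communication). Hypothetical domains and keywords;
# illustrative only, not the speaker's actual method.
from collections import Counter

DOMAIN_KEYWORDS = {                       # hypothetical domain signatures
    "biomed": {"protein", "gene", "clinical"},
    "legal": {"court", "statute", "plaintiff"},
    "news": {"election", "market", "weather"},
}

def route(document: str) -> str:
    """Pick the expert whose keyword set best overlaps the document."""
    tokens = set(document.lower().split())
    overlaps = {d: len(tokens & kw) for d, kw in DOMAIN_KEYWORDS.items()}
    return max(overlaps, key=overlaps.get)

def train(corpus: list[str]) -> Counter:
    """Count updates per expert: only the routed expert is 'updated'
    for each document (sparse, conditional updates)."""
    updates = Counter()
    for doc in corpus:
        updates[route(doc)] += 1          # dense training would update all
    return updates

corpus = [
    "the gene encodes a protein studied in clinical trials",
    "the court ruled the statute unconstitutional",
]
print(train(corpus))                      # each document touches one expert
```

Because each document touches exactly one expert, the experts never exchange gradients, which is what makes this style of training embarrassingly parallel and lets experts be mixed, added, or removed after training.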
Biography: Suchin Gururangan is a third-year PhD candidate at the University of Washington, advised by Noah A. Smith and Luke Zettlemoyer. He was previously a visiting researcher at Meta AI and a predoctoral resident at the Allen Institute for AI, and spent several years in industry as a data scientist. His research interests span many areas of NLP; currently he works on modular, sparse language models that are efficient to customize and scale. His work has received awards at ACL 2020 and 2021, and he is supported by the Bloomberg Data Science PhD Fellowship.
Host: Jon May and Justin Cho
More Info: https://nlg.isi.edu/nl-seminar/
Webcast: https://www.youtube.com/watch?v=lWlVRGgwRK4
Location: Information Sciences Institute (ISI) - Virtual and ISI-Conf Rm#689
WebCast Link: https://www.youtube.com/watch?v=lWlVRGgwRK4
Audiences: Everyone Is Invited
Contact: Pete Zamar
Event Link: https://nlg.isi.edu/nl-seminar/
This event is open to all eligible individuals. USC Viterbi operates all of its activities consistent with the University's Notice of Non-Discrimination. Eligibility is not determined based on race, sex, ethnicity, sexual orientation, or any other prohibited factor.