University of Southern California

Events Calendar




Conferences, Lectures, & Seminars
Events for June

  • NL Seminar - Sources of Variance in Pretraining and Finetuning LLMs

    Mon, Jun 13, 2022 @ 02:00 PM - 03:00 PM

    Information Sciences Institute

    Conferences, Lectures, & Seminars


    Speaker: Naomi Saphra, NYU

    Talk Title: Sources of Variance in Pretraining and Finetuning LLMs

    Series: NL Seminar

    Abstract: REMINDER:
    Meeting hosts only admit guests they know to the Zoom meeting, so you are strongly encouraged to sign into Zoom with your USC account.

    If you are an outside visitor, please inform us beforehand at nlg DASH seminar DASH host AT isi DOT edu so we will be aware of your attendance and can let you in.

    You have engaged in the very modern practice of transfer learning. You pretrained a model on a self-supervised objective, then finetuned it on a downstream task, and you find excellent performance on the test set. "Aha," you say, "I found a good pretraining procedure." Did you? You try finetuning again. The results are terrible! "Aha," you say, "I found a bad finetuning procedure." Did you?

    The random seeds for both the pretraining and finetuning stages have a substantial influence on the outcome. However, it is computationally expensive to pretrain new models, so measuring the robustness of a procedure across different seeds can be prohibitive. This talk will address, first, the influence that a pretraining seed has on both in-domain and out-of-domain (OOD) performance. Then we will address the role of the finetuning seed. Much of the variation in OOD generalization can be ascribed to where the finetuning seeds direct SGD trajectories. In particular, we discuss how to predict generalization behavior in a finetuned model based on topographic properties of its region of the loss surface. By understanding the degree of influence that random seeds have on performance, we can fairly evaluate a robust training procedure, rather than a single set of parameters. By understanding the mechanism of that influence, we can go further and develop improved training methods.
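    The experimental design the abstract describes — holding one seed fixed while varying the other and measuring the spread in held-out performance — can be sketched in miniature. This is a toy illustration only, not the speaker's setup: `finetune` is a hypothetical stand-in where a logistic-regression "model" gets its initialization from a pretraining seed and its SGD data order from a finetuning seed.

    ```python
    import numpy as np

    def finetune(pretrain_seed: int, finetune_seed: int, n_train: int = 200) -> float:
        """Toy stand-in for a pretrain/finetune run: the pretraining seed fixes
        the 'pretrained' initialization, the finetuning seed fixes the SGD data
        order. Returns held-out accuracy of a logistic-regression 'model'."""
        rng_data = np.random.default_rng(0)            # the task data is fixed
        X = rng_data.normal(size=(n_train + 100, 2))
        y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)

        rng_pre = np.random.default_rng(pretrain_seed)
        w = rng_pre.normal(scale=2.0, size=2)          # 'pretrained' init
        rng_ft = np.random.default_rng(finetune_seed)
        order = rng_ft.permutation(n_train)            # finetuning data order

        for i in order:                                # one epoch of SGD
            p = 1.0 / (1.0 + np.exp(-X[i] @ w))
            w -= 0.1 * (p - y[i]) * X[i]

        preds = (X[n_train:] @ w > 0).astype(float)
        return float((preds == y[n_train:]).mean())

    # Variance across finetuning seeds, holding the pretraining seed fixed:
    accs = [finetune(pretrain_seed=1, finetune_seed=s) for s in range(10)]
    print(np.mean(accs), np.std(accs))
    ```

    Reporting the spread (not just the best run) across such a sweep is what lets one evaluate the procedure rather than a single set of parameters.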


    Biography: Naomi's research interests concern NLP learning dynamics: how models learn to encode linguistic structure, and how we can encode useful inductive biases into the training process. Having earned a PhD from the University of Edinburgh, they are now a postdoc at NYU. In their spare time, they play roller derby under the name Gaussian Retribution, do standup comedy, and shepherd programmers who cannot type into the world of code dictation.

    Host: Jon May and Thamme Gowda

    More Info: https://nlg.isi.edu/nl-seminar/

    Webcast: https://www.youtube.com/watch?v=Lni4PIlbJjI

    Location: Information Sciences Institute (ISI) - Virtual

    WebCast Link: https://www.youtube.com/watch?v=Lni4PIlbJjI

    Audiences: Everyone Is Invited

    Contact: Pete Zamar

  • NL Seminar - Weighted Finite-State Transducers: The Later Years

    Thu, Jun 23, 2022 @ 11:00 AM - 12:00 PM

    Information Sciences Institute

    Conferences, Lectures, & Seminars


    Speaker: Kyle Gorman, Graduate Center, City University of New York and Google Inc.

    Talk Title: Weighted Finite-State Transducers: The Later Years

    Series: NL Seminar

    Abstract: REMINDER:
    Meeting hosts only admit guests they know to the Zoom meeting, so you are strongly encouraged to sign into Zoom with your USC account.

    If you are an outside visitor, please inform us beforehand at nlg DASH seminar DASH host AT isi DOT edu so we will be aware of your attendance and can let you in.

    In-person attendance will be permitted for USC/ISI faculty, staff, and students only. The talk is open to the public virtually via the Zoom registration link.

    While the deep learning tsunami defines the state of the art in speech and language processing, finite-state transducer grammars developed by linguists and engineers are still widely used in highly multilingual settings, particularly for front-end speech applications. In this talk, I will first briefly review the current state of the OpenFst and OpenGrm finite-state transducer libraries. I will then discuss several recent innovations in the finite-state world. These include algorithms for inducing text normalization and grapheme-to-phoneme grammars from parallel data, heuristic optimization of arbitrary weighted transducers, and an algorithm for efficiently computing the single shortest string of a wider variety of non-deterministic weighted acceptors.
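    To give a flavor of the shortest-string problem the abstract mentions, here is a minimal Dijkstra-style sketch over the tropical semiring (path weight = sum of arc costs; best path = minimum cost). This is a generic illustration, not the speaker's algorithm or OpenFst's API, and it really computes the single shortest *path*; the shortest-*string* problem proper is harder in non-deterministic acceptors, where several distinct paths can spell the same string and their weights must be combined.

    ```python
    import heapq

    def shortest_string(arcs, start, finals):
        """Shortest accepted path through a weighted acceptor under the tropical
        semiring. `arcs` maps state -> list of (label, weight, next_state);
        `finals` maps final state -> final weight."""
        frontier = [(0.0, start, "")]   # (cost so far, state, string so far)
        done = set()
        while frontier:
            cost, state, s = heapq.heappop(frontier)
            if state in done:
                continue
            done.add(state)
            if state in finals:          # first final state popped is cheapest
                return s, cost + finals[state]
            for label, w, nxt in arcs.get(state, []):
                if nxt not in done:
                    heapq.heappush(frontier, (cost + w, nxt, s + label))
        return None

    # A small non-deterministic acceptor with two competing paths:
    arcs = {
        0: [("a", 1.0, 1), ("b", 0.5, 2)],
        1: [("c", 0.2, 3)],
        2: [("d", 1.0, 3)],
    }
    print(shortest_string(arcs, start=0, finals={3: 0.0}))  # → ('ac', 1.2)
    ```

    The "ac" path costs 1.0 + 0.2 = 1.2, beating the "bd" path's 0.5 + 1.0 = 1.5, even though "bd" starts with the cheaper arc.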

    Biography: Kyle Gorman is an assistant professor of linguistics at the Graduate Center, City University of New York, and director of the master's program in computational linguistics. He is also a software engineer in the speech and language algorithms group at Google. With Richard Sproat, he is the coauthor of Finite-State Text Processing and the creator of Pynini, a finite-state text processing library for Python. He has also published on statistical methods for comparing computational models, text normalization, grapheme-to-phoneme conversion, and morphological analysis, as well as many topics in linguistic theory.

    Host: Jon May and Thamme Gowda

    More Info: https://nlg.isi.edu/nl-seminar/

    Webcast: https://www.youtube.com/watch?v=BpEqB3Vj4mM

    Location: Information Sciences Institute (ISI) - Virtual

    WebCast Link: https://www.youtube.com/watch?v=BpEqB3Vj4mM

    Audiences: Everyone Is Invited

    Contact: Pete Zamar

  • NL Seminar - Anti-Queer Bias in Large Language Models

    Thu, Jun 30, 2022 @ 03:00 PM - 04:00 PM

    Information Sciences Institute

    Conferences, Lectures, & Seminars


    Speaker: Katy Felkner, USC/ISI

    Talk Title: Anti-Queer Bias in Large Language Models

    Series: NL Seminar

    Abstract: REMINDER:
    Meeting hosts only admit guests they know to the Zoom meeting, so you are strongly encouraged to sign into Zoom with your USC account.

    If you are an outside visitor, please inform us beforehand at nlg DASH seminar DASH host AT isi DOT edu so we will be aware of your attendance and can let you in.

    In-person attendance will be permitted for USC/ISI faculty, staff, and students only. The talk is open to the public virtually via the Zoom registration link.

    Happy Pride! To close out Pride Month at ISI, this talk will discuss fairness and bias in LLMs as they relate to the LGBTQ community. We will explore current methods for detecting and mitigating bias in LLMs, as well as the lack of current research focusing specifically on homophobic and transphobic biases. The talk will present recent exploratory work on whether, and to what extent, biases against queer and trans people are encoded in large language models (LLMs) such as BERT. It will discuss a new method for reducing these biases in downstream tasks: fine-tuning the models on data written by and/or about queer people. It will also discuss a new benchmark dataset, WinoQueer, modeled after other bias-detection benchmarks but addressing homophobic and transphobic biases. This work was accepted to the Queer in AI workshop at NAACL 2022.
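    Paired-sentence bias benchmarks of the kind the abstract alludes to typically compare a model's scores on matched sentence pairs. The sketch below is a generic illustration of that evaluation pattern, not the WinoQueer methodology itself; `toy_score` is a placeholder where a real evaluation would use a masked LM's pseudo-log-likelihood.

    ```python
    from typing import Callable, List, Tuple

    def bias_score(pairs: List[Tuple[str, str]],
                   score: Callable[[str], float]) -> float:
        """Generic paired-sentence bias metric (in the style of benchmarks such
        as CrowS-Pairs): each pair holds a more-stereotyping and a less-
        stereotyping sentence. Returns the fraction of pairs where the model
        scores the stereotyping sentence higher; 0.5 indicates no preference."""
        biased = sum(score(stereo) > score(counter) for stereo, counter in pairs)
        return biased / len(pairs)

    # Placeholder scorer, illustrative only: prefers shorter sentences.
    def toy_score(sentence: str) -> float:
        return -len(sentence)

    pairs = [
        ("sentence A1", "sentence B1 longer"),
        ("sentence A2 longer", "sentence B2"),
    ]
    print(bias_score(pairs, toy_score))  # → 0.5
    ```

    Under this metric, mitigation (e.g. fine-tuning on data by and/or about the affected community) succeeds to the extent it moves the score toward 0.5.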

    Biography: Katy Felkner is a rising 3rd-year PhD student at the USC Information Sciences Institute. Her primary research focus is extremely low-resource machine translation. She is also interested in fairness and bias in large language models. Prior to USC, she received dual bachelor's degrees in Computer Science and Letters (general humanities) from the University of Oklahoma. Her research is supported by an NSF Graduate Research Fellowship. Katy is passionate about making computer science more welcoming for women and queer students.

    Host: Jon May and Thamme Gowda

    More Info: https://nlg.isi.edu/nl-seminar/

    Webcast: https://usc.zoom.us/j/96713436677

    Location: Information Sciences Institute (ISI) - Virtual

    WebCast Link: https://usc.zoom.us/j/96713436677

    Audiences: Everyone Is Invited

    Contact: Pete Zamar
