University of Southern California

Events Calendar


  • NL Seminar - Manipulating Large Language Model Predictions Through Data

    Thu, Nov 09, 2023 @ 11:00 AM - 12:00 PM

    Information Sciences Institute

    Conferences, Lectures, & Seminars


    Speaker: Alexander Wan, University of California, Berkeley

    Talk Title: Manipulating Large Language Model Predictions Through Data

    Series: NL Seminar

    Abstract: This talk will be a live presentation only; it will not be recorded.
    REMINDER: Meeting hosts admit only guests they know to the Zoom meeting, so you are strongly encouraged to sign into Zoom with your USC account.
    If you’re an outside visitor, please send your full name, title, and name of workplace to (nlg-seminar-host(at)isi.edu) beforehand so we’ll be aware of your attendance. Also, let us know whether you plan to attend in person or virtually.
    More Info on NL Seminars can be found at: https://nlg.isi.edu/nl-seminar/ 
    Large language models use large amounts of unmoderated data at each stage of the training and deployment pipeline. In this talk, I will show how these lax requirements enable adversaries to manipulate both training and test data, allowing a myriad of possible attacks. First, at training time, I will show that adversaries can modify instruction-tuning datasets to systematically manipulate predictions across a range of tasks or induce degenerate outputs across hundreds of arbitrary tasks, using as few as 100 poison examples. At inference time, additional data is often used in retrieval- or tool-augmented models. Naturally, these models will face information from a wide variety of sources of varying quality. Humans face this same range of sources but can judge trustworthiness based on factors like the style of argumentation or the recency of information. We show not only that model predictions differ significantly from human credibility judgements, but also that gaps in this judgement create opportunities for adversaries to manipulate answers to user queries.

    Biography: Alexander Wan is a third-year undergraduate at UC Berkeley majoring in Computer Science, Statistics, and Mathematics. He works closely with folks at the Berkeley NLP Group and the MSU Heterogeneous Learning and Reasoning lab, with a focus on improving the robustness and interpretability of large language models. He's also more broadly interested in the intersection of machine learning and cognitive science: using current ML models to better understand human cognition and building more robust models through cognitively inspired architectures and training.

    Host: Jon May and Justin Cho

    More Info: https://nlg.isi.edu/nl-seminar/

    Webcast: https://usc.zoom.us/j/95174101995

    Location: Information Sciences Institute (ISI) - Virtual and ISI-Conf Rm #689

    Audiences: Everyone Is Invited

    Contact: Pete Zamar

    Event Link: https://nlg.isi.edu/nl-seminar/
