-
NL Seminar - Manipulating Large Language Model Predictions Through Data
Thu, Nov 09, 2023 @ 11:00 AM - 12:00 PM
Information Sciences Institute
Conferences, Lectures, & Seminars
Speaker: Alexander Wan, University of Cal-Berkeley
Talk Title: Manipulating Large Language Model Predictions Through Data
Series: NL Seminar
Abstract: This talk will be a live presentation only, it will not be recorded.
REMINDER: Meeting hosts only admit guests that they know to the Zoom meeting. Hence, you are highly encouraged to use your USC account to sign into Zoom.
If you’re an outside visitor, please provide your: Full Name, Title and Name of Workplace to (nlg-seminar-host(at)isi.edu) beforehand so we’ll be aware of your attendance. Also, let us know if you plan to attend in-person or virtually.
More Info on NL Seminars can be found at: https://nlg.isi.edu/nl-seminar/
Large language models use large amounts of unmoderated data at each stage of the training and deployment pipeline. In this talk, I will show how these lax requirements enable adversaries to manipulate both training and test data, allowing a myriad of possible attacks. First, during training time, I will show that adversaries can modify instruction-tuning datasets to systematically manipulate predictions across a range of tasks or induce degenerate outputs across hundreds of arbitrary tasks, using as few as 100 poison examples. At inference time, additional data is often used in retrieval- or tool-augmented models. Naturally, these models will face information from a wide variety of sources that have varying degrees of quality. Humans are also faced with this same range of sources but can make judgements of trustworthiness based on factors like the style of argumentation or the recency of information. We show that not only do model predictions differ significantly from human credibility judgements, but also that gaps in this judgement creates opportunities for adversaries to manipulate answers to user queries.
Biography: Alexander Wan is a third-year undergraduate at UC Berkeley majoring in Computer Science, Statistics, and Mathematics. He works closely with folks at the Berkeley NLP Group and the MSU Heterogeneous Learning and Reasoning lab, with a focus on improving the robustness and interpretability of large language models. He's also more broadly interested in the intersection of machine learning and cognitive science: using current ML models to better understand human cognition and building more robust models through cognitively inspired architectures and training.
Host: Jon May and Justin Cho
More Info: https://nlg.isi.edu/nl-seminar/
Webcast: https://usc.zoom.us/j/95174101995Location: Information Science Institute (ISI) - Virtual and ISI-Conf Rm#689
WebCast Link: https://usc.zoom.us/j/95174101995
Audiences: Everyone Is Invited
Contact: Pete Zamar
Event Link: https://nlg.isi.edu/nl-seminar/