Events Calendar


  • NL Seminar: The Surprising Effectiveness of Membership Inference with Simple N-Gram Coverage

    Thu, Apr 24, 2025 @ 11:00 AM - 12:00 PM

    Information Sciences Institute

    Conferences, Lectures, & Seminars


    Speaker: Skyler Hallinan, USC

    Talk Title: The Surprising Effectiveness of Membership Inference with Simple N-Gram Coverage

    Series: NL Seminar

    Abstract: Meeting hosts admit only online guests they know to the Zoom meeting, so you are strongly encouraged to sign into Zoom with your USC account. If you are an outside visitor, please inform us at nlg-seminar-host(at)isi.edu so that we are aware of your attendance and can admit you. Specify whether you will attend remotely or in person at least one business day before the event. Provide your full name, job title, and professional affiliation, and arrive at least 10 minutes before the seminar begins. If you do not have access to the 6th floor for in-person attendance, please check in at the 10th floor main reception desk to register as a visitor, and someone will escort you to the conference room.
    Join via Zoom: https://usc.zoom.us/j/96791099940?pwd=6kov3zTLAnD4JU49d1VtX4XNAZMcvs.1
    Meeting ID: 967 9109 9940
    Passcode: 840282
    Membership inference attacks serve as a useful tool for the fair use of language models, for example in detecting potential copyright infringement and auditing data leakage. However, many current state-of-the-art attacks require access to models' hidden states or probability distributions, which prevents investigation of more widely used, API-access-only models like GPT-4. In this work, we introduce N-Gram Coverage Attack, a membership inference attack that relies solely on text outputs from the target model, enabling attacks on completely black-box models. We leverage the observation that models are more likely to memorize and subsequently generate text patterns that were commonly observed in their training data. Specifically, to make a prediction on a candidate member, N-Gram Coverage Attack first obtains multiple model generations conditioned on a prefix of the candidate. It then uses n-gram overlap metrics to compute and aggregate the similarities of these outputs with the ground truth suffix; high similarities indicate likely membership. We first demonstrate on a diverse set of existing benchmarks that N-Gram Coverage Attack outperforms other black-box methods while also achieving performance comparable to, or even better than, state-of-the-art white-box attacks, despite having access to only text outputs. Interestingly, we find that the success rate of our method scales with the attack compute budget: as we increase the number of sequences generated from the target model conditioned on the prefix, attack performance tends to improve. Having verified the accuracy of our method, we use it to investigate previously unstudied closed OpenAI models on multiple domains. We find that more recent models, such as GPT-4o, exhibit increased robustness to membership inference, suggesting an evolving trend toward improved privacy protections.
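
The procedure described in the abstract (sample several continuations of a prefix, score their n-gram overlap with the true suffix, then aggregate) can be illustrated with a minimal sketch. This is not the speaker's implementation. The whitespace tokenization, the 50% prefix split, the trigram size, the specific coverage metric, and the `max` aggregation are all assumptions chosen for illustration, and `generate` and `toy_model` are hypothetical stand-ins for a call to the target model's text API.

```python
from typing import Callable, List, Set, Tuple


def ngrams(tokens: List[str], n: int) -> Set[Tuple[str, ...]]:
    """Return the set of contiguous n-grams in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def coverage(generation: str, suffix: str, n: int = 3) -> float:
    """Fraction of the suffix's n-grams that also occur in the generation."""
    target = ngrams(suffix.split(), n)
    if not target:
        return 0.0  # suffix too short to contain any n-gram
    return len(target & ngrams(generation.split(), n)) / len(target)


def membership_score(
    candidate: str,
    generate: Callable[[str, int], List[str]],  # hypothetical model interface
    num_samples: int = 8,      # assumed sampling budget
    prefix_frac: float = 0.5,  # assumed prefix/suffix split
    n: int = 3,                # assumed n-gram size
    aggregate: Callable[[List[float]], float] = max,  # assumed aggregation
) -> float:
    """Higher scores suggest the candidate is more likely a training member."""
    tokens = candidate.split()
    cut = max(1, int(len(tokens) * prefix_frac))
    prefix, suffix = " ".join(tokens[:cut]), " ".join(tokens[cut:])
    generations = generate(prefix, num_samples)
    scores = [coverage(g, suffix, n) for g in generations]
    return aggregate(scores) if scores else 0.0


def toy_model(prefix: str, k: int) -> List[str]:
    """Hypothetical stand-in for a black-box model that memorized one sentence."""
    memorized = "the quick brown fox jumps over the lazy dog again"
    if memorized.startswith(prefix):
        return [memorized[len(prefix):].strip()] * k
    return ["no overlap with anything here"] * k


if __name__ == "__main__":
    seen = "the quick brown fox jumps over the lazy dog again"
    unseen = "a completely different sentence that the model never saw before"
    print(membership_score(seen, toy_model))
    print(membership_score(unseen, toy_model))
```

In this sketch a membership decision would come from thresholding the score; how the threshold is chosen, and which overlap metric and aggregation work best, are questions for the talk rather than anything this example settles.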

    Biography: Skyler Hallinan is a Ph.D. student in Computer Science at the University of Southern California where he is advised by Xiang Ren. His research aims to build trustworthy AI systems with robust reasoning capabilities via data-centric approaches. His work spans three core areas: understanding how data impacts downstream model behavior, safeguarding user data and privacy, and advancing model capabilities with better data. Previously, he was a research intern at Apple and Amazon, and received a B.S./M.S. in Computer Science from the University of Washington, where he was advised by Yejin Choi.

    Host: Jonathan May and Katy Felkner

    More Info: https://www.isi.edu/research-groups-nlg/nlg-seminars/

    Webcast: https://usc.zoom.us/j/96791099940?pwd=6kov3zTLAnD4JU49d1VtX4XNAZMcvs.1

    Location: Information Sciences Institute (ISI) - Conf Rm#689

    Audiences: Everyone Is Invited

    Contact: Pete Zamar

    Event Link: https://www.isi.edu/research-groups-nlg/nlg-seminars/


    This event is open to all eligible individuals. USC Viterbi operates all of its activities consistent with the University's Notice of Non-Discrimination. Eligibility is not determined based on race, sex, ethnicity, sexual orientation, or any other prohibited factor.

