
Events Calendar


  • PhD Dissertation Defense - Mozhdeh Gheini

Fri, Jan 24, 2025 @ 10:00 AM - 12:00 PM

    Thomas Lord Department of Computer Science

    University Calendar


    Title: Inductive Biases for Data- and Parameter-Efficient Transfer Learning
     
    Date and Time: Fri, Jan 24, 2025 @ 10:00 AM - 12:00 PM
     
Location: Henry Salvatori Computer Science Center (SAL) - 213 and https://usc.zoom.us/j/6564802162
     
    Committee Members: Jonathan May (Chair), Emilio Ferrara, Xuezhe Ma, Khalil Iskarous
     
Abstract: Data- and resource-intensive pre-training and fine-tuning of Transformer-based models is the dominant paradigm at the forefront of rapid advancements in natural language processing, human language technologies, and, most notably, large language models. Such reliance on massive amounts of data, computation, and energy, while effective and impressive from a performance-only perspective, can hinder the open, nonexclusive, and sustainable development of these technologies. In this talk, we present how certain inductive biases can be devised to adapt current natural language processing methods to resource-constrained scenarios and provide insights into why the proposed inductive biases succeed in such cases.
     
    Specifically, we discuss four research directions on data and parameter efficiency of fine-tuning and transfer learning in natural language processing: (1) a universal regimen that creates a single pre-trained checkpoint suitable for machine translation transfer to practically any language pair and eliminates the need for ad hoc pre-training; (2) an architecture-guided parameter-efficient fine-tuning method that performs competitively with full fine-tuning while exclusively updating cross-attention parameters; (3) an analysis of MEGA, a recently introduced augmentation of the Transformer architecture to incorporate explicit recency bias, through the lens of transfer learning; and (4) a meta-learning algorithm to prime pre-trained models for specific fine-tuning strategies.  
     
Combined with ablations that show why they are effective and analyses that demonstrate their generalizability, these directions are meant to serve as tools for resource-efficient transfer learning in natural language processing.
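
    As a rough illustration of direction (2), the sketch below freezes everything in a generic encoder-decoder Transformer except its cross-attention parameters before fine-tuning. This is a minimal, assumption-laden sketch, not the dissertation's implementation: it uses a plain torch.nn.Transformer (whose decoder layers name their cross-attention module multihead_attn) and an arbitrary placeholder learning rate.

        import torch
        from torch import nn

        # Illustrative sketch only: a generic encoder-decoder Transformer,
        # not the specific architecture studied in the dissertation.
        model = nn.Transformer(d_model=512, nhead=8,
                               num_encoder_layers=6, num_decoder_layers=6)

        # Freeze every parameter except cross-attention weights. In
        # torch.nn.TransformerDecoderLayer the cross-attention module is
        # named `multihead_attn` (decoder self-attention is `self_attn`).
        for name, param in model.named_parameters():
            param.requires_grad = "multihead_attn" in name

        trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
        total = sum(p.numel() for p in model.parameters())
        print(f"training {trainable:,} of {total:,} parameters "
              f"({100 * trainable / total:.1f}%)")

        # The optimizer only ever sees the unfrozen cross-attention
        # parameters; the learning rate is an arbitrary placeholder.
        optimizer = torch.optim.AdamW(
            (p for p in model.parameters() if p.requires_grad), lr=5e-5)

    Updating only the cross-attention parameters shrinks the trainable footprint to a small fraction of the full model while leaving the pre-trained encoder and decoder self-attention untouched, which is the kind of parameter-efficiency trade-off the talk examines.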

    Location: Henry Salvatori Computer Science Center (SAL) - 213

    Audiences: Everyone Is Invited

    Contact: Mozhdeh Gheini

