Logo: University of Southern California

Events Calendar


  • AI Seminar: Experiments in Scaling Reinforcement Learning with Verifiable Rewards

    Fri, Apr 04, 2025 @ 11:00 AM - 12:00 PM

    Information Sciences Institute

    Conferences, Lectures, & Seminars


    Speaker: Nathan Lambert, Allen Institute for AI

    Talk Title: Experiments in Scaling Reinforcement Learning with Verifiable Rewards

    Series: AI Seminar

    Abstract: With the release of DeepSeek’s R1 reasoning model, interest in reinforcement learning may be at an all-time high. Academics are pouring energy into the space, trying to replicate DeepSeek’s results and establish the trade-offs and capabilities of this new era of reinforcement learning on language models. This talk discusses these new results with language models trained with Reinforcement Learning with Verifiable Rewards (RLVR), our efforts at scaling them for Ai2’s OLMo and Tülu language models, hints we may have missed that RL is more effective than people give it credit for, and some history from my background in model-based RL and robotics. The goal of the talk is to present a mix of (recent) historical context on language modeling and cutting-edge research with RL to forecast how the rapidly expanding language model industry may change in the near future.
     

    Biography: Nathan Lambert is a Senior Research Scientist and post-training lead at the Allen Institute for AI, where he focuses on building open language models. He also founded and operates Interconnects.ai to increase transparency and understanding of current AI models and systems.
    Previously, he helped build an RLHF research team at HuggingFace. He received his PhD from the University of California, Berkeley, working at the intersection of machine learning and robotics. He was advised by Professor Kristofer Pister in the Berkeley Autonomous Microsystems Lab and Roberto Calandra at Meta AI Research. He was lucky to intern at Facebook AI and DeepMind during his PhD. Nathan was awarded the UC Berkeley EECS Demetri Angelakos Memorial Achievement Award for Altruism for his efforts to better community norms.
    If the speaker approves recording of this seminar, it will be posted on the USC/ISI YouTube page within 1-2 business days: https://www.youtube.com/user/USCISI.
    Subscribe here to learn more about upcoming seminars: https://www.isi.edu/events/ .

    Host: Eric Boxer and Justina Gilleland

    More Info: https://www.isi.edu/events/5553/experiments-in-scaling-reinforcement-learning-with-verifiable-rewards/

    Webcast: https://usc.zoom.us/j/94409584905?pwd=Sm5LVkd0bndUdEluM3piK0NWTUQrUT09

    Location: Information Sciences Institute (ISI) - Virtual Only

    Audiences: Everyone Is Invited

    Contact: Pete Zamar


    This event is open to all eligible individuals. USC Viterbi operates all of its activities consistent with the University's Notice of Non-Discrimination. Eligibility is not determined based on race, sex, ethnicity, sexual orientation, or any other prohibited factor.
