University of Southern California

Events Calendar





Conferences, Lectures, & Seminars
Events for November

  • NL Seminar - What We Learned from 570K ChatGPT Interaction Logs In The Wild

    Thu, Nov 02, 2023 @ 11:00 AM - 12:00 PM

    Information Sciences Institute

    Conferences, Lectures, & Seminars


    Speaker: Wenting Zhao, Cornell University

    Talk Title: What We Learned from 570K ChatGPT Interaction Logs In The Wild

    Series: NL Seminar

    Abstract: Reminder: Meeting hosts only admit guests that they know to the Zoom meeting. Hence, you are highly encouraged to use your USC account to sign into Zoom. If you are an outside visitor, please inform us at nlg DASH seminar DASH host AT isi DOT edu beforehand so we will be aware of your attendance and can let you in. In-person attendance will be permitted for USC/ISI faculty, staff, and students only. The event is open to the public virtually via the Zoom link. More info on NL Seminars can be found at: https://nlg.isi.edu/nl-seminar/
    Chatbots such as GPT-4 and ChatGPT are currently serving millions of users. Despite their widespread use, there remains a lack of public datasets showing how these tools are used in practice. In this talk, I will introduce WildChat, a corpus of 570K user-ChatGPT conversations comprising over 1.5 million interaction turns. I will show that, compared to other popular user-chatbot interaction datasets, WildChat offers the most diverse user prompts and presents the richest variety of potentially toxic use cases. Finally, I will demonstrate the potential utility of this dataset in fine-tuning state-of-the-art instruction-following models.

    Biography: Wenting Zhao is a Ph.D. candidate in Computer Science at Cornell University. Her research focuses on improving the reasoning capabilities of large language models by exploiting explicit problem structures. She has organized an ACL tutorial on complex reasoning over natural language and the second workshop on Natural Language Reasoning and Structured Explanations. She has done internships at IBM Research, Amazon Alexa, and AI2 Mosaic.

    Host: Jon May and Justin Cho

    More Info: https://nlg.isi.edu/nl-seminar/

    Webcast: https://youtu.be/lx1XcTdhalU

    Location: Information Sciences Institute (ISI) - Virtual and ISI Conference Room #689

    WebCast Link: https://youtu.be/lx1XcTdhalU

    Audiences: Everyone Is Invited

    Contact: Pete Zamar

    Event Link: https://nlg.isi.edu/nl-seminar/


    This event is open to all eligible individuals. USC Viterbi operates all of its activities consistent with the University's Notice of Non-Discrimination. Eligibility is not determined based on race, sex, ethnicity, sexual orientation, or any other prohibited factor.

  • NL Seminar - Manipulating Large Language Model Predictions Through Data

    Thu, Nov 09, 2023 @ 11:00 AM - 12:00 PM

    Information Sciences Institute

    Conferences, Lectures, & Seminars


    Speaker: Alexander Wan, University of California, Berkeley

    Talk Title: Manipulating Large Language Model Predictions Through Data

    Series: NL Seminar

    Abstract: This talk will be a live presentation only; it will not be recorded.
    REMINDER: Meeting hosts only admit guests that they know to the Zoom meeting. Hence, you are highly encouraged to use your USC account to sign into Zoom.
    If you are an outside visitor, please provide your full name, title, and workplace to nlg-seminar-host(at)isi.edu beforehand so we will be aware of your attendance. Also, let us know if you plan to attend in person or virtually.
    More info on NL Seminars can be found at: https://nlg.isi.edu/nl-seminar/
    Large language models use large amounts of unmoderated data at each stage of the training and deployment pipeline. In this talk, I will show how these lax requirements enable adversaries to manipulate both training and test data, allowing a myriad of possible attacks. First, during training time, I will show that adversaries can modify instruction-tuning datasets to systematically manipulate predictions across a range of tasks or induce degenerate outputs across hundreds of arbitrary tasks, using as few as 100 poison examples. At inference time, additional data is often used in retrieval- or tool-augmented models. Naturally, these models will face information from a wide variety of sources of varying quality. Humans face this same range of sources but can make judgements of trustworthiness based on factors like the style of argumentation or the recency of information. We show not only that model predictions differ significantly from human credibility judgements, but also that gaps in this judgement create opportunities for adversaries to manipulate answers to user queries.

    Biography: Alexander Wan is a third-year undergraduate at UC Berkeley majoring in Computer Science, Statistics, and Mathematics. He works closely with folks at the Berkeley NLP Group and the MSU Heterogeneous Learning and Reasoning lab, with a focus on improving the robustness and interpretability of large language models. He's also more broadly interested in the intersection of machine learning and cognitive science: using current ML models to better understand human cognition and building more robust models through cognitively inspired architectures and training.

    Host: Jon May and Justin Cho

    More Info: https://nlg.isi.edu/nl-seminar/

    Webcast: https://usc.zoom.us/j/95174101995

    Location: Information Sciences Institute (ISI) - Virtual and ISI Conference Room #689

    WebCast Link: https://usc.zoom.us/j/95174101995

    Audiences: Everyone Is Invited

    Contact: Pete Zamar

    Event Link: https://nlg.isi.edu/nl-seminar/


    This event is open to all eligible individuals. USC Viterbi operates all of its activities consistent with the University's Notice of Non-Discrimination. Eligibility is not determined based on race, sex, ethnicity, sexual orientation, or any other prohibited factor.

  • NL Seminar - Cultural Knowledge and Cultural Biases: Analyzing the Multilingual Performance of Text-to-Image Models

    Thu, Nov 16, 2023 @ 11:00 AM - 12:00 PM

    Information Sciences Institute

    Conferences, Lectures, & Seminars


    Speaker: Michael Saxon, UCSB

    Talk Title: Cultural Knowledge and Cultural Biases: Analyzing the Multilingual Performance of Text-to-Image Models

    Abstract: REMINDER: Meeting hosts only admit guests that they know to the Zoom meeting. Hence, you are highly encouraged to use your USC account to sign into Zoom. If you are an outside visitor, please provide your full name, title, and workplace to nlg-seminar-host(at)isi.edu beforehand so we will be aware of your attendance. Also, let us know if you plan to attend in person or virtually. More info on NL Seminars can be found at: https://nlg.isi.edu/nl-seminar/
    Despite being ostensibly trained on solely English data, most text-to-image (T2I) models carry some degree of multilingual capability, with significant variation in performance between models and languages. To guide the future development of T2I systems, it is desirable both to measure and to qualitatively analyze these language-specific performance variations, in order to mitigate cross-lingual disparities in performance as well as language-specific demographic biases. To quantify multilingual performance, we introduce the Conceptual Coverage Across Languages (CoCo-CroLa) benchmark, which allows us to measure the "possession" of a set of tangible noun "concepts" across English, Spanish, German, Chinese, Japanese, Hebrew, and Indonesian. This technique allows us to estimate how well suited a model is to a target language, as well as to identify model-specific weaknesses, spurious correlations, and biases without any a priori assumptions of their form. We demonstrate how it can be used to rank T2I models in terms of multilinguality, and that despite its simplicity our method captures the necessary conditions for the impressive "creative" generative abilities users expect from T2I models.
    We then build on this benchmarking work with a detailed qualitative analysis of "failure" and "success" cases for specific concepts. Even in the "possession" case, concepts are expressed differently across languages. These qualitative cross-lingual variations in model behavior form a continuous spectrum of ethical acceptability, running the gamut from culturally variable popular dog breeds to racially biased sexualization in depictions of women. While the edge cases are easy to laud or condemn, drawing the line of acceptability between them is an open ethical question as well as an open technical challenge. Unfortunately, interventions that successfully remove the most deleterious biases also erase cultural distinctiveness, motivating a need for more targeted interventions in future work.

    Biography: Michael Saxon is a CS Ph.D. candidate in the NLP Group at the University of California, Santa Barbara. His research is driven by a desire to improve our objective understanding of the semantic capabilities of large generative AI systems, in particular generative image and language models. Toward this goal, he focuses on developing novel data resources and metrics to model semantic phenomena in generative models, as well as techniques for model-driven dataset improvement to remove biases and spurious correlations. He has previously interned at Meta AI and Amazon working on NLP and speech, and is supported by the NSF Graduate Research Fellowship Program.

    Host: Jon May and Justin Cho

    More Info: https://nlg.isi.edu/nl-seminar/

    Webcast: https://youtu.be/nlu57ZSKbi0

    Location: Information Sciences Institute (ISI) - Virtual and ISI Conference Room #689

    WebCast Link: https://youtu.be/nlu57ZSKbi0

    Audiences: Everyone Is Invited

    Contact: Pete Zamar

    Event Link: https://nlg.isi.edu/nl-seminar/


    This event is open to all eligible individuals. USC Viterbi operates all of its activities consistent with the University's Notice of Non-Discrimination. Eligibility is not determined based on race, sex, ethnicity, sexual orientation, or any other prohibited factor.

  • NL Seminar - Machine Learning with Human Fault-Tolerance

    Thu, Nov 30, 2023 @ 11:00 AM - 12:00 PM

    Information Sciences Institute

    Conferences, Lectures, & Seminars


    Speaker: Kawin Ethayarajh, Stanford University

    Talk Title: Machine Learning with Human Fault-Tolerance

    Abstract: REMINDER: This talk will be a live presentation only; it will not be recorded. Meeting hosts only admit guests that they know to the Zoom meeting. Hence, you are highly encouraged to use your USC account to sign into Zoom. If you are an outside visitor, please provide your full name, title, and workplace to nlg-seminar-host(at)isi.edu beforehand so we will be aware of your attendance. Also, let us know if you plan to attend in person or virtually. More info on NL Seminars can be found at: https://nlg.isi.edu/nl-seminar/
    In machine learning, we have long recognized the need to build systems that can tolerate hardware faults and software faults. In this talk, I propose the need for a third kind of fault-tolerance: human fault-tolerance. The methods used to develop, evaluate, and deploy machine learning systems today assume that the humans who build and use them are rational actors making highly informed decisions based on consistent preferences; this is far from true in practice. We can address the failures of these assumptions by drawing from economics, a field that has long been aware of how unfounded beliefs about human behavior can go wrong. Specifically, I will cover how we can develop theoretically grounded tools that discover human mistakes, design algorithms and methods for robustly eliciting and incorporating human feedback, and implement end-to-end platforms that make ML and NLP more transparent and reproducible. This line of work has led to the creation of datasets, models, and platforms that have been widely adopted by industry giants such as Amazon, Google, and Meta.

    Biography: Kawin Ethayarajh is a fifth-year Ph.D. student at Stanford University, where he works on bringing human fault-tolerance to machine learning. His research draws from economics to make machine learning and NLP more robust to the irrational, inconsistent, and uninformed human decisions made at every step. His work has been supported by a Facebook Fellowship and an NSERC PGS-D, and he received an Outstanding Paper Award at ICML 2022. He co-created the Stanford Human Preferences dataset and the Dynaboard platform (behind Dynabench).

    Host: Jon May and Justin Cho

    More Info: https://nlg.isi.edu/nl-seminar/

    Webcast: https://usc.zoom.us/j/99484520082

    Location: Information Sciences Institute (ISI) - Virtual and ISI Conference Room #689

    WebCast Link: https://usc.zoom.us/j/99484520082

    Audiences: Everyone Is Invited

    Contact: Pete Zamar

    Event Link: https://nlg.isi.edu/nl-seminar/


    This event is open to all eligible individuals. USC Viterbi operates all of its activities consistent with the University's Notice of Non-Discrimination. Eligibility is not determined based on race, sex, ethnicity, sexual orientation, or any other prohibited factor.