Logo: University of Southern California

Events Calendar

  • NL Seminar: Cultural Knowledge and Cultural Biases: Analyzing the Multilingual Performance of Text-to-Image Models

    Thu, Nov 16, 2023 @ 11:00 AM - 12:00 PM

    Information Sciences Institute

    Conferences, Lectures, & Seminars

    Speaker: Michael Saxon, UCSB

    Talk Title: Cultural Knowledge and Cultural Biases: Analyzing the Multilingual Performance of Text-to-Image Models

    Abstract: REMINDER: Meeting hosts only admit guests that they know to the Zoom meeting. Hence, you are highly encouraged to use your USC account to sign into Zoom. If you’re an outside visitor, please provide your full name, title, and name of workplace to (nlg-seminar-host(at)isi.edu) beforehand so we’ll be aware of your attendance. Also, let us know if you plan to attend in person or virtually. More info for NL Seminars can be found at: https://nlg.isi.edu/nl-seminar/

    Despite being ostensibly trained on solely English data, most text-to-image (T2I) models carry some degree of multilingual capability, with significant variation in performance between models and languages. To guide the future development of T2I systems, both measuring and qualitatively analyzing these language-specific performance variations is desirable, to mitigate cross-lingual disparities in performance as well as language-specific demographic biases.

    To quantify multilingual performance we introduce the Conceptual Coverage Across Languages (CoCo-CroLa) benchmark, which allows us to measure the "possession" of a set of tangible noun "concepts" across English, Spanish, German, Chinese, Japanese, Hebrew, and Indonesian. This technique allows us to estimate how well-suited a model is to a target language as well as identify model-specific weaknesses, spurious correlations, and biases without any a priori assumptions of their form. We demonstrate how it can be used to rank T2I models in terms of multilinguality, and that despite its simplicity our method captures the necessary conditions for the impressive “creative” generative abilities users expect from T2I models.

    We then build on this benchmarking work with a detailed qualitative analysis of “failure” and “success” cases for specific concepts. Even in the “possession” case, concepts are expressed differently across languages. These qualitative cross-lingual variations in model behaviors form a continuous spectrum of ethical acceptability, running the gamut from culturally variable popular dog breeds to racially biased sexualization in depictions of women. While the edge cases are easy to laud or condemn, drawing the line of acceptability in between them is an open ethical question as well as an open technical challenge. Unfortunately, interventions that successfully remove the most deleterious biases also erase cultural distinctiveness, motivating a need for more targeted interventions in future work.

    Biography: Michael Saxon is a CS Ph.D. candidate in the NLP Group at the University of California, Santa Barbara. His research is driven by a desire to improve our objective understanding of the semantic capabilities of large generative AI systems, in particular generative image and language models. Toward this goal he focuses on developing novel data resources and metrics to model semantic phenomena in generative models, as well as techniques for model-driven dataset improvement to remove biases and spurious correlations. He has previously interned at Meta AI and Amazon working on NLP and speech, and is supported by the NSF Graduate Research Fellowship Program.

    Host: Jon May and Justin Cho

    More Info: https://nlg.isi.edu/nl-seminar/

    Webcast: https://youtu.be/nlu57ZSKbi0

    Location: Information Sciences Institute (ISI) - Virtual and ISI-Conf Rm#689

    Audiences: Everyone Is Invited

    Contact: Pete Zamar

    Event Link: https://nlg.isi.edu/nl-seminar/
