PhD Dissertation Defense - Zihao He
Wed, Feb 05, 2025 @ 12:00 PM - 02:00 PM
Thomas Lord Department of Computer Science
Title: Aligning Large Language Models with Human Perspectives
Date & Time: Wednesday, February 5, 2025, 12:00 PM - 2:00 PM
Location: Ronald Tutor Hall of Engineering (RTH) 306
Committee: Kristina Lerman (Chair, CS), Emilio Ferrara (CS), Marlon Twyman (Communication)
Abstract: Large Language Models (LLMs) are increasingly deployed in real-world applications. However, their ability to accurately represent diverse human perspectives remains a critical challenge. This thesis investigates LLM alignment, which refers to how closely these models reflect the ideologies, values, and communication styles of specific communities. First, I develop methods for aligning LLMs to online communities and introduce Community-Cross-Instruct, a framework that generates structured instruction-answer pairs to enhance fidelity and scalability. Second, I propose comprehensive evaluation frameworks to assess alignment beyond positional stances, including affective alignment (how well LLMs capture emotional and moral tones) and multidimensional evaluations across authenticity, toxicity, and harm. Finally, I explore ethical risks in alignment, demonstrating how minimal biased data during instruction tuning can shift an LLM's behavior, raising concerns about ideological manipulation. These findings highlight the technical, evaluation, and ethical complexities of LLM alignment, providing a foundation for ensuring that LLMs reflect diverse human perspectives and remain robust to ideological manipulation.
Zoom Link: https://usc.zoom.us/j/97020518118?pwd=mZeDv2WhswDGTouNvvWFI9NFqhO5KR.1
Audiences: Everyone Is Invited
Contact: Zihao He