University of Southern California

Events Calendar

  • PhD Defense - Christopher Wienberg

    Tue, Mar 28, 2017 @ 11:00 AM - 01:00 PM

    Thomas Lord Department of Computer Science

    University Calendar

    Title: Demographic Bias Correction for Social Media Data

    PhD Candidate: Christopher Wienberg

    Date and Time: Tuesday, March 28th, 11:00am
    Location: Zumberge Hall (ZHS) 360

    For generations, people have kept records of their everyday lives. The web is now a popular place for people to document their personal lives, replacing the journals and diaries that were popular decades ago. The popularity of weblogs and social media has provided a unique opportunity to study people at a massive scale. Social media researchers have seized this chance to use social media data to predict and measure social phenomena, such as elections, economic activity, and public health. While their work has shown promise, these researchers frequently highlight a challenge with web data: web users, as a group, differ from most offline populations (e.g., they are younger and wealthier).

    Demographic representativity is an issue that economists and other social scientists deal with regularly. They have found that re-weighting survey samples based on demographic variables like age and gender can improve the accuracy of survey results. They directly account for this need by asking survey respondents to provide their demographic background. In contrast, social media analysts do not have immediate access to these demographic variables.
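
    The re-weighting idea described above can be illustrated with a minimal post-stratification sketch. This is not the method from the dissertation; the function name `poststratified_mean`, the group labels, and the population shares are all hypothetical, chosen only to show how a sample that over-represents one group can be corrected when each member's demographic group is known.

```python
from collections import defaultdict


def poststratified_mean(sample, population_shares):
    """Estimate a population mean by re-weighting group means.

    sample: list of (group, value) pairs from the (biased) sample.
    population_shares: dict mapping group -> share of the target
        population (assumed to sum to 1; hypothetical numbers here).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for group, value in sample:
        totals[group] += value
        counts[group] += 1

    estimate = 0.0
    for group, share in population_shares.items():
        if counts[group] == 0:
            # A group with no sample members cannot be re-weighted.
            raise ValueError(f"no sample members in group {group!r}")
        # Weight each group's sample mean by its population share.
        estimate += share * totals[group] / counts[group]
    return estimate


# Hypothetical sample: younger users are over-represented (8 of 10).
sample = [("young", 1.0)] * 8 + [("old", 0.0)] * 2
shares = {"young": 0.5, "old": 0.5}  # assumed offline population mix

raw_mean = sum(value for _, value in sample) / len(sample)
corrected = poststratified_mean(sample, shares)
print(raw_mean, corrected)
```

    In this toy case the unweighted mean follows the over-represented group, while the re-weighted estimate reflects the assumed population mix. The sketch presumes each user's group is known, which is exactly what social media data usually lacks.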

    This dissertation proposes and evaluates a practical approach for making social predictions from social media data while contending with demographic representativity issues. It describes the collection and analysis of reliable data describing a population of web users. Social predictions are drawn from this population, and various bias correction techniques are evaluated by comparing their results to gold-standard data from traditionally collected surveys. Special attention is paid to important practical considerations, such as the errors that automated methods introduce when characterizing the demographic and other attributes of individual users, and the impact of those errors on predictions for the broader population.

    Committee:
    Andrew S. Gordon (chair)
    Ellis Horowitz
    Arie Kapteyn

    Location: James H. Zumberge Hall Of Science (ZHS) - 360

    Audiences: Everyone Is Invited

    Contact: Lizsl De Leon

