Tue, Mar 28, 2017 @ 11:00 AM - 01:00 PM
Thomas Lord Department of Computer Science
Title: Demographic Bias Correction for Social Media Data
PhD Candidate: Christopher Wienberg
Date and Time: Tuesday, March 28th, 11:00am
Location: Zumberge Hall (ZHS) 360
For generations, people have been keeping records of their everyday lives. The web is now a popular place for people to document their personal lives, replacing the journals and diaries that were popular decades ago. The popularity of weblogs and social media has provided a unique opportunity to study people at a massive scale. Social media researchers have seized this chance to use social media data to predict and measure social phenomena, such as elections, economic activity, and public health. While these researchers' work has shown promise, they frequently highlight a challenge with web data: web users, as a group, differ (e.g., younger, wealthier) from most offline populations.
Demographic representativity is an issue that economists and other social scientists deal with regularly. They have found that re-weighting survey samples based on demographic variables like age and gender can improve the accuracy of survey results. They meet this need directly by asking survey respondents to provide their demographic background. In contrast, social media analysts do not have immediate access to these demographic variables.
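As a rough illustration of the re-weighting idea described above, the sketch below applies simple post-stratification to a toy sample. It is not the method used in the dissertation; the function name `poststratified_mean`, the age groups, the responses, and the population shares are all hypothetical values chosen for illustration.

```python
from collections import Counter


def poststratified_mean(sample, population_shares):
    """Estimate a population mean by re-weighting a sample by group.

    sample: list of (group, value) pairs, e.g. ("18-29", 1).
    population_shares: dict mapping group -> share of the target population.
    Groups that never appear in the sample cannot be re-weighted and are ignored.
    """
    counts = Counter(group for group, _ in sample)
    n = len(sample)
    # Weight for each group = population share / sample share.
    weights = {g: population_shares[g] / (counts[g] / n) for g in counts}
    total_weight = sum(weights[g] for g, _ in sample)
    return sum(weights[g] * v for g, v in sample) / total_weight


# Hypothetical sample that over-represents younger users (8 of 10 respondents).
sample = (
    [("18-29", 1)] * 6 + [("18-29", 0)] * 2   # younger group: 6 of 8 answer "yes"
    + [("30+", 1)] * 1 + [("30+", 0)] * 1     # older group: 1 of 2 answers "yes"
)
# Hypothetical offline population: 25% younger, 75% older.
population_shares = {"18-29": 0.25, "30+": 0.75}

raw_mean = sum(v for _, v in sample) / len(sample)
adjusted_mean = poststratified_mean(sample, population_shares)
print(round(raw_mean, 4), round(adjusted_mean, 4))
```

In this toy case the unweighted estimate is 0.7, while the re-weighted estimate is about 0.56, because the over-represented younger group is down-weighted to match its assumed share of the population. The approach presumes each respondent's group is known, which is exactly what social media data usually lacks.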
This dissertation proposes and evaluates a practical approach for making social predictions from social media data while contending with demographic representativity issues. It describes the collection and analysis of reliable data describing a population of web users. Social predictions are drawn from this population, with various bias correction techniques evaluated by comparing to gold standard data from traditionally collected surveys. Special attention is paid to important practical considerations, such as errors introduced by automated methods to characterize the demographic and other attributes of individual users and their impact on predictions for the broader population.
Andrew S. Gordon (chair)
Audiences: Everyone Is Invited
Contact: Lizsl De Leon