A Computational Framework for Diversity in Ensembles of Humans and Machine Systems
Mon, Apr 21, 2014 @ 03:00 PM - 05:00 PM
Ming Hsieh Department of Electrical and Computer Engineering
Conferences, Lectures, & Seminars
Speaker: Kartik Audhkhasi, University of Southern California
Talk Title: A Computational Framework for Diversity in Ensembles of Humans and Machine Systems
Abstract: My Ph.D. thesis presents a computational framework for diversity in ensembles, or collections, of humans and machine systems used for signal and information processing. Machine system ensembles have outperformed single systems across many pattern recognition tasks, ranging from automatic speech recognition to online recommendation. Likewise, ensembles are central to computing with humans, for example, in crowdsourcing-based data tagging and annotation in human behavioral signal processing. This widespread use of ensembles, albeit largely heuristic, is motivated by their robustness to the ambiguity in the production, representation, and processing of real-world information. Diversity, or complementarity, of the individual humans and machine systems is widely accepted as a key ingredient in ensemble design. I will present a computational framework for this diversity by addressing three important problems: modeling, analysis, and design.
I will first propose the Globally-Variant Locally-Constant (GVLC) model for the labeling behavior of a diverse ensemble. The GVLC model captures the data-dependent reliability and diverse behavior of an ensemble through a latent state-dependent noisy channel. I will next present the Generalized Ambiguity Decomposition (GAD) theorem, which defines ensemble diversity for a broad class of statistical learning loss functions and relates this diversity to ensemble performance. I will show an application of the GAD theorem by theoretically and empirically linking the diversity of an automatic speech recognition system ensemble with the word error rate of the fused hypothesis. The final part of my thesis will present techniques to design a diverse ensemble of machine systems, ranging from maximum entropy models to sequence classifiers. I will also prove that introducing diversity in the training data through careful noise addition speeds up the maximum likelihood training of Restricted Boltzmann Machines and feed-forward neural networks.
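For readers unfamiliar with how diversity can be tied to ensemble performance, the following is a minimal sketch of the classical ambiguity decomposition for squared-error loss, which the GAD theorem is described as generalizing. It is not the GAD theorem itself and is not taken from the talk. The function name `ambiguity_decomposition`, the toy predictions, and the uniform weights are assumptions chosen for illustration. For a convex combination of predictors, the squared error of the fused prediction equals the weighted average member error minus the weighted average ambiguity (the spread of the members around the fused prediction).

```python
import random


def ambiguity_decomposition(predictions, weights, target):
    """Return (ensemble_error, avg_member_error, avg_ambiguity) for squared loss.

    Assumes the weights are non-negative and sum to 1 (a convex combination).
    """
    # Fused (ensemble) prediction: weighted average of the member predictions.
    fused = sum(w * p for w, p in zip(weights, predictions))
    # Squared error of the fused prediction.
    ensemble_error = (fused - target) ** 2
    # Weighted average of the individual members' squared errors.
    avg_member_error = sum(
        w * (p - target) ** 2 for w, p in zip(weights, predictions)
    )
    # Weighted average squared deviation of members from the fused prediction.
    avg_ambiguity = sum(
        w * (p - fused) ** 2 for w, p in zip(weights, predictions)
    )
    return ensemble_error, avg_member_error, avg_ambiguity


# Hypothetical toy ensemble: five noisy estimates of the same target value.
rng = random.Random(0)
target = 1.0
predictions = [target + rng.gauss(0.0, 0.5) for _ in range(5)]
weights = [1.0 / len(predictions)] * len(predictions)

ens_err, avg_err, ambiguity = ambiguity_decomposition(predictions, weights, target)

# The identity: ensemble error = average member error - average ambiguity.
assert abs(ens_err - (avg_err - ambiguity)) < 1e-12
print(f"ensemble error       = {ens_err:.6f}")
print(f"average member error = {avg_err:.6f}")
print(f"average ambiguity    = {ambiguity:.6f}")
```

Because the ambiguity term is never negative, the fused prediction is never worse than the weighted average member under squared loss, and more disagreement among members widens that gap for a fixed average member error.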
Biography: Kartik Audhkhasi received the B.Tech. degree in Electrical Engineering and the M.Tech. degree in Information and Communication Technology from the Indian Institute of Technology, Delhi. He is currently an Electrical Engineering Ph.D. candidate at the University of Southern California (USC), Los Angeles. His research focuses on a computational framework for diversity in ensembles of humans and machine systems for signal and information processing. He is broadly interested in statistical signal processing, speech processing and recognition, machine learning, and human-centered computing.
He is a recipient of the Annenberg fellowship and the IBM Ph.D. fellowship, and was a 2012 USC Ming Hsieh Institute Ph.D. Scholar. Kartik was part of the USC team that won the Interspeech-2013 Computational Paralinguistics Challenge. He has also received best paper and best teaching assistant awards from the Electrical Engineering Department at USC.
Host: Prof. Shrikanth S. Narayanan
Location: Ronald Tutor Hall of Engineering (RTH) - 320
Audiences: Everyone Is Invited
Contact: Talyia Veal