Improving Features and Models for Automatic Emotion Prediction in Acted Speech
Wed, Nov 14, 2012 @ 11:00 AM - 12:30 PM
Ming Hsieh Department of Electrical and Computer Engineering
Conferences, Lectures, & Seminars
Speaker: Ani Nenkova, University of Pennsylvania
Talk Title: Improving Features and Models for Automatic Emotion Prediction in Acted Speech
Abstract: In this talk I will present our recent work on emotion prediction in acted speech, as well as validation on spontaneous speech.
We introduce a class of spectral features computed over three phoneme type classes of interest: stressed vowels, unstressed vowels, and consonants in the utterance. Classification accuracies are consistently higher for our features than for prosodic or utterance-level spectral features. Combining our phoneme-class features with prosodic features leads to further improvement. Further analyses reveal that spectral features computed from consonant regions of the utterance contain more information about emotion than either stressed or unstressed vowel features. We also explore how emotion recognition accuracy depends on utterance length. We show that, while there is no significant dependence for utterance-level prosodic features, the accuracy of emotion recognition using class-level spectral features increases with utterance length.
We also introduce a novel emotion recognition approach that integrates ranking models. The approach is speaker independent, yet it is designed to exploit information from other utterances by the same speaker in the test set before making predictions. It achieves much higher precision in identifying emotional utterances than a conventional SVM classifier. Furthermore, we test several possibilities for combining conventional classification with predictions based on ranking. All combinations improve overall prediction accuracy.
This is joint work with Houwei Cao, Dmitri Bitouk, and Ragini Verma.
Biography: Ani Nenkova is an Assistant Professor of Computer and Information Science at the University of Pennsylvania. Her main areas of research are automatic summarization, discourse, and text quality. She obtained her PhD degree in Computer Science from Columbia University in 2006. She also spent a year and a half as a postdoctoral fellow at Stanford University before joining Penn in Fall 2007.
Host: Chi-Chun Lee, Angeliki Metallinou, Prof. Shrikanth Narayanan
Location: Ronald Tutor Hall of Engineering (RTH) - 320
Audiences: Everyone Is Invited
Contact: Mary Francis