Sparse Representation Methods for Speech and Language Processing
Mon, Apr 12, 2010 @ 10:00 AM - 11:00 AM
Ming Hsieh Department of Electrical and Computer Engineering
Conferences, Lectures, & Seminars
Abstract:
Sparse representation techniques, such as Support Vector Machines (SVMs), k-nearest neighbor (kNN) and Bayesian Compressive Sensing (BCS), can be used to characterize a test sample from a few support training samples in a dictionary set. Traditional Compressive Sensing-based methods have been used for signal reconstruction and compression. They have also been successfully applied to the classification of fMRI images. This talk presents our recent work on sparse representations for phonetic classification, speech recognition and text classification in general. The importance of a prior, the sparseness constraint, and the choice of dictionary to this framework will be discussed. Representing a test example as a linear combination of features from the training set yields a phone error rate (PER) of 19.0% on the well-studied TIMIT phone recognition task, which is the best number reported in the literature to date. Motivated by this result, we also propose a set of features that are a function of the phonetic labels of the original dictionary. These features can be used to create a new representation of the test sample, in which the test sample is better linked to the actual units/labels to be recognized.
(Joint work with Tara Sainath, Dimitri Kanevsky and David Nahamoo.)
Bio:
Dr. Bhuvana Ramabhadran is the Manager of the Speech Transcription and Synthesis Research Group at the IBM T.J. Watson Center. Upon joining IBM in 1995, she made significant contributions to the ViaVoice line of products, focusing on acoustic modeling. She has served as the Principal Investigator of two major international projects: the NSF-sponsored MALACH project, developing algorithms for transcription of elderly, accented speech from Holocaust survivors, and the EU-sponsored TC-STAR project, developing algorithms for recognition of EU parliamentary speeches. She has served as technical chair for conferences, organized workshops, and currently serves on the Speech and Language Technical Committee of the IEEE Signal Processing Society (SPS). Her research interests include speech recognition algorithms, statistical signal processing, pattern recognition and biomedical engineering.
Host: Professor Shrikanth Narayanan
Location: Ronald Tutor Hall of Engineering (RTH) - 320
Audiences: Everyone Is Invited
Contact: Mary Francis