  Modeling Speech Production: From MRI Data to Articulatory Gestures

    Thu, Nov 17, 2016 @ 01:00 PM - 02:00 PM

    Ming Hsieh Department of Electrical and Computer Engineering

    Speaker: Dr. Asterios Toutios, Research Associate, USC

    Talk Title: Modeling Speech Production: From MRI Data to Articulatory Gestures

    Abstract: Novel technologies for imaging the vocal tract, such as real-time MRI, offer extraordinary opportunities for moving speech production research forward. A long-term goal of my research is to develop a modular architecture for synthesizing personalized, highly intelligible, and natural-sounding speech by combining vocal-tract imaging with mathematical modeling and linguistic knowledge. My approach is to model direct observations of the time-varying changes in vocal-tract shaping in order to derive functional mappings from linguistic structures to synthesized vocal-tract dynamics. These dynamics will then drive a realistic simulation of how the dynamically changing vocal tract forms speech acoustics. Such an effort may have important technological impact and may help validate a substantial body of scientific knowledge on the mechanisms of human speech production. In this talk, I will discuss a framework for deriving, from real-time MRI data, the spatiotemporal deployment of articulatory gestures (which may be viewed as linguistic, cognitive, or motor control targets) in fluent speech and in a speaker-specific manner. The framework includes: automatic segmentation of articulators in real-time MRI videos; the derivation of a guided factor analysis model of the vocal-tract geometry; a locally linear mapping between deformations of articulators and vocal-tract constrictions; and the application of a novel convolutive non-negative matrix factorization algorithm.

    Biography: Asterios Toutios is a research associate with the Signal Analysis and Interpretation Laboratory (SAIL) at USC, where he leads and coordinates the Speech Production and Articulation kNowledge (SPAN) group. His main research interest is modeling human speech production on the basis of direct observations of the vocal-tract dynamic configuration, with a view to informing and enhancing speech technologies such as synthesis, recognition, and speaker identification. He received his academic degrees in Thessaloniki, Greece: Diploma/MEng in Electrical and Computer Engineering (1999, Aristotle University); MSc in Information Systems (2002, University of Macedonia); PhD in Applied Informatics (2007, University of Macedonia). He then held postdoctoral research positions in France, at LORIA and TELECOM ParisTech, before moving to Southern California in June 2012. He has authored or co-authored more than 40 peer-reviewed publications in journals and international conferences. He has also translated a book on mathematical finance from English to Greek, published a few poems, and sung for a little-known alternative rock band.

    Host: Dr. Sandeep K. Gupta

    Location: Hughes Aircraft Electrical Engineering Center (EEB) - 248

