Events Calendar


  • Multimodal Signal Processing: Signals from, to, and for humans

    Thu, Mar 03, 2011 @ 10:30 AM - 12:30 PM

    Ming Hsieh Department of Electrical and Computer Engineering

    Conferences, Lectures, & Seminars


    Speaker: Panayiotis (Panos) Georgiou, University of Southern California

    Talk Title: Multimodal Signal Processing: Signals from, to, and for humans

    Abstract: The 1990s saw an explosion of ideas in merging traditional signal processing techniques with personal communication and entertainment, supported by Web technologies. We are presently experiencing yet another paradigm shift in human interaction and communication, for example through social media and online information sharing. Notably, there has been significant movement toward employing information and communications technologies to transform people's access to, and participation in, their own health and well-being.

    My research lies in the exciting convergence of signal processing, multimedia, and speech applications centered on novel processing of signals from, to, and for humans. This effort entails a range of challenges in the sensing, recognition, interpretation, and context exploitation of complex human behavior, both at the explicit and implicit levels. Importantly, the effort includes the creation of algorithms and models that are inspired by, and emulate, how humans make use of the behavioral signal information in specific, societally-meaningful application settings.

    In this talk, using specific examples, I will focus on two aspects of my work that aim at capturing and exploiting human interaction and its environment in a context-aware way: (1) The convergence of multimodal signal processing and evidence-based assessment in observational practice in mental health. Specifically, I will discuss our recent efforts in instrumenting, collecting, and analyzing multimodal data for assessing behavioral cues relevant to the field of family psychology. The approach relies on array signal processing and machine learning techniques based on training data labeled by domain experts. We both exploit existing data and pursue new multimodal data acquisition approaches.

    (2) The inherently rich nature of the human communication channel raises interesting challenges when one or more aspects are compromised by human or environmental factors. We have been developing speech-to-speech translation technologies, especially targeting cross-lingual/cross-cultural urban healthcare settings. Many open questions remain, including what information is relevant, how it needs to be captured and transferred from source to target (e.g., lexical and paralinguistic information), and how conceptual information encoded in the speech signal can be modeled in a communication-channel framework. I will highlight some of the advances and open questions in these two domains.

    Biography: Panayiotis G. Georgiou received his B.A. and M.Eng. degrees with Honors from Cambridge University (Pembroke College), U.K., in 1996. He received his M.S. and Ph.D. degrees from the University of Southern California in 1998 and 2002, respectively. From 1992 to 1996 he held a Commonwealth Scholarship from the Cambridge Commonwealth Trust.

    Since 2003 he has been a member of the Speech Analysis and Interpretation Lab, first as a Research Associate and currently as a Research Assistant Professor. His interests span the fields of human social and cognitive signal processing. He has worked on and published over 70 papers in the fields of statistical signal processing, alpha-stable distributions, speech and multimodal signal processing and interfaces, speech translation, language modeling, immersive sound processing, sound source localization, and speaker identification. He has been an investigator and co-PI on several federally funded projects, notably including the DARPA Transtac “SpeechLinks” and the NSF (Large) “An Integrated Approach to Creating Enriched Speech Translation Systems”. He is currently serving as guest editor of the Computer Speech and Language journal. He has received best paper awards for his pioneering work in analyzing the multimodal behaviors of users in speech-to-speech translation and for automatic classification of married couples’ behavior using audio features.

    His current focus is on multimodal environments, behavioral signal processing, and speech-to-speech translation.

    Host: Professor Richard Leahy

    Location: Hughes Aircraft Electrical Engineering Center (EEB)

    Audiences: Everyone Is Invited

    Contact: Talyia Veal
