USC - Viterbi School of Engineering

Events for July 19, 2013

AI Seminar- Martha Palmer: "Annotating Resources for the Clinical Domain"
Fri, Jul 19, 2013 @ 11:00 AM - 12:00 PM
Information Sciences Institute
Conferences, Lectures, & Seminars

Speaker: Martha Palmer, University of Colorado, Boulder

Talk Title: "Annotating Resources for the Clinical Domain"

Series: Artificial Intelligence Seminar

Abstract: In the general domain, large-scale linguistic annotation of syntactic structure and semantic labels fostered truly revolutionary advances in natural language processing systems. The availability of a similar large annotated resource for clinical language would enable equivalent progress in this domain by advancing methods development through rule-based and statistical approaches, involving a larger research community in the study of difficult NLP problems, and porting best-of-breed methodologies to healthcare.

Under the Strategic Health Advanced Research Project Area 4 (SHARP 4; www.sharpn.org) and the THYME NIH grant (1 R01 LM010090-01A1, Temporal Relation Discovery for Clinical Text, PI: Savova), 500,000 tokens of clinical narrative spread across specialties, patients, note types and three sites (Mayo Clinic, Seattle Group Health Cooperative and Intermountain Health Care) are being annotated. The linguistic annotations, which comprise constituency parses, dependency parses, semantic role labels, coreference and temporal relations, are being done at the University of Colorado. In addition, domain-specific entity and relation annotations are being done following UMLS guidelines jointly by Colorado, Harvard and the Mayo Clinic.

There are many challenges in porting general-domain annotation schemes to the clinical domain, due to the fragmentary, informal style of the text and the domain-specific terminology. It is also important to create diverse annotation datasets and to explore more efficient methodologies for porting to new domains, such as active learning. This talk will describe the status of the annotation effort, how the challenges are being addressed, the performance improvements observed in the newly trained components, and experiments with active learning for smart data selection.

Biography: Martha Palmer is a Full Professor at the University of Colorado with joint appointments in Linguistics and Computer Science and is an Institute of Cognitive Science Faculty Fellow. She won a Boulder Faculty Assembly 2010 Research Award and was the Director of the 2011 Linguistics Institute in Boulder, CO. Her research has focused on capturing elements of the meanings of words that can comprise automatic representations of complex sentences and documents. Supervised machine learning techniques rely on vast amounts of annotated training data, so she and her students are engaged in providing data with word sense tags and semantic role labels for English, Chinese, Arabic, Hindi, and Urdu, funded by DARPA and NSF. They also train automatic sense taggers and semantic role labelers, and extract bilingual lexicons from parallel corpora. A more recent focus is the application of these methods to biomedical journal articles and clinical notes, funded by NIH. She is a co-editor for the Journal of Natural Language Engineering and for LiLT, Linguistic Issues in Language Technology, and is on the CLJ Editorial Board. She is a past President of the Association for Computational Linguistics and a past Chair of SIGLEX and SIGHAN.

Host: David Chiang

More Info: http://webcasterms1.isi.edu/mediasite/SilverlightPlayer/Default.aspx?peid=ad22eb4390944a439d0e1eeada255aa21d

Webcast: TBA
Location: Information Sciences Institute (ISI) - 11th Flr Conf Rm # 1135
WebCast Link: TBA
Audiences: Everyone Is Invited

Contact: Peter Zamar

Event Link: http://webcasterms1.isi.edu/mediasite/SilverlightPlayer/Default.aspx?peid=ad22eb4390944a439d0e1eeada255aa21d

This event is open to all eligible individuals. USC Viterbi operates all of its activities consistent with the University's Notice of Non-Discrimination. Eligibility is not determined based on race, sex, ethnicity, sexual orientation, or any other prohibited factor.

NL Seminar- Jackie Lee: "Bayesian Approaches to Acoustic Model and Pronunciation Lexicon Discovery"
Fri, Jul 19, 2013 @ 03:00 PM - 04:00 PM
Information Sciences Institute
Conferences, Lectures, & Seminars

Speaker: Jackie Lee, MIT

Talk Title: "Bayesian Approaches to Acoustic Model and Pronunciation Lexicon Discovery"

Series: Natural Language Seminar

Abstract: In the first part of the talk, we investigate the problem of acoustic modeling in which prior language-specific knowledge and transcribed data are unavailable. We present an unsupervised model that simultaneously segments the speech, discovers a proper set of sub-word units (e.g., phones) and learns a Hidden Markov Model (HMM) for each induced acoustic unit. Our approach is formulated as a Dirichlet process mixture model in which each mixture component is an HMM that represents a sub-word unit. We apply our model to the TIMIT corpus, and the results demonstrate that it discovers phone units that are highly correlated with English phones and produces better segmentation than the state-of-the-art baselines. We also test the quality of the learned acoustic models on a spoken term detection task. Compared to the baseline, our model improves the detection precision of top hits by a large margin.

The creation of a pronunciation lexicon remains the most inefficient process in developing an automatic speech recognizer. In the second part of the talk, we discuss an unsupervised alternative to the conventional manual approach for creating pronunciation dictionaries. We present a hierarchical Bayesian model, which jointly discovers the phonetic inventory and the Letter-to-Sound (L2S) mapping rules in a language using only transcribed data. When tested on a corpus of spontaneous queries, our results demonstrate the superiority of the proposed joint learning scheme over its sequential counterpart, in which the latent phonetic inventory and L2S mappings are learned separately. Furthermore, the recognizers built with the automatically induced lexicon consistently outperform grapheme-based recognizers and even approach the performance of recognition systems trained using conventional supervised procedures.

Biography: http://groups.csail.mit.edu/sls/people/clee.shtml

Host: Qing Dou

More Info: http://nlg.isi.edu/nl-seminar/

Location: Information Sciences Institute (ISI) - 11th Flr Conf Rm # 1135, Marina Del Rey
Audiences: Everyone Is Invited

Contact: Peter Zamar

Event Link: http://nlg.isi.edu/nl-seminar/
