USC - Viterbi School of Engineering

NL Seminar - Decipherment for Universal Language Tools: A Case Study for Unsupervised Part-of-Speech Induction
Fri, Aug 17, 2018 @ 03:00 PM - 04:00 PM
Information Sciences Institute
Conferences, Lectures, & Seminars

Speaker: Ronald Cardenas, USC

Talk Title: Decipherment for Universal Language Tools: A Case Study for Unsupervised Part-of-Speech Induction

Series: Natural Language Seminar

Abstract: Unsupervised Part-of-Speech (POS) induction can be viewed as a two-step task. The first step infers a sequence of states, while the second step maps this sequence to an actual POS sequence at training or testing time. The second step therefore requires reference tagged data, a luxury that low-resource target languages might not have. In this talk, we present an alternative approach to the second step, modeling it as a decipherment problem in which the ciphered text is the sequence of states and the original text we want to recover is the POS sequence. This approach requires no reference data in the target language and allows us to leverage POS sequences from languages with much richer resources. Our experiments show that our approach benefits the most from simple strategies for inferring state sequences, such as Brown clustering. This allows our method to obtain reasonable performance in low-resource and limited-time scenarios.
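
To make the decipherment framing concrete, the following is a minimal sketch and not the speaker's actual model, which the abstract does not specify. It assumes the simplest possible cipher: a 1-to-1 substitution in which each induced state (e.g. a Brown cluster ID) stands for exactly one POS tag. A bigram model over tags is trained on tagged text from a resource-rich language, and the key is found by exhaustive search. All names (train_bigram_lm, decipher, the cluster IDs c3, c5, c7) and the toy data are hypothetical.

```python
import itertools
import math
from collections import Counter


def train_bigram_lm(tag_sequences, tagset, alpha=0.1):
    """Return a log-probability function for an add-alpha smoothed tag bigram model."""
    bigrams, contexts = Counter(), Counter()
    for seq in tag_sequences:
        prev = "<s>"  # sentence-start symbol
        for tag in seq:
            bigrams[(prev, tag)] += 1
            contexts[prev] += 1
            prev = tag

    def logprob(prev, tag):
        num = bigrams[(prev, tag)] + alpha
        den = contexts[prev] + alpha * len(tagset)
        return math.log(num / den)

    return logprob


def decipher(state_sequences, tagset, logprob):
    """Try every 1-to-1 state->tag key and keep the one whose decoded
    tag sequences score highest under the tag bigram model.
    Assumption: there are no more distinct states than tags."""
    states = sorted({s for seq in state_sequences for s in seq})
    best_score, best_key = -math.inf, None
    for assignment in itertools.permutations(tagset, len(states)):
        key = dict(zip(states, assignment))
        score = 0.0
        for seq in state_sequences:
            prev = "<s>"
            for state in seq:
                tag = key[state]
                score += logprob(prev, tag)
                prev = tag
        if score > best_score:
            best_score, best_key = score, key
    return best_key, best_score


tagset = ["DET", "NOUN", "VERB"]

# "Plaintext" side: tagged sentences from a resource-rich language (toy data).
rich_tags = [
    ["DET", "NOUN", "VERB", "DET", "NOUN"],
    ["DET", "NOUN", "VERB"],
    ["NOUN", "VERB", "DET", "NOUN"],
]

# "Ciphertext" side: induced state sequences for the target language (toy data).
target_states = [
    ["c7", "c3", "c5", "c7", "c3"],
    ["c7", "c3", "c5"],
]

lm = train_bigram_lm(rich_tags, tagset)
mapping, score = decipher(target_states, tagset, lm)
print(mapping)  # {'c3': 'NOUN', 'c5': 'VERB', 'c7': 'DET'}
```

Exhaustive search is only feasible for a handful of states and tags. A realistic setting, with many clusters per tag and a noisy channel, would more plausibly be handled with a probabilistic channel model trained by expectation-maximization, but that is beyond what this sketch shows.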

Biography: Ronald Cardenas is a Master's student in the Language and Communication Technologies programme at Charles University in Prague. His research interests span morphological analysis and parsing of low-resource languages. At ISI, he works with Jonathan May on developing universal language tools.

Host: Nanyun Peng

More Info: http://nlg.isi.edu/nl-seminar/

Location: Information Sciences Institute (ISI) - 11th Flr Conf Rm # 1135, Marina Del Rey
Audiences: Everyone Is Invited

Contact: Peter Zamar

Event Link: http://nlg.isi.edu/nl-seminar/

This event is open to all eligible individuals. USC Viterbi operates all of its activities consistent with the University's Notice of Non-Discrimination. Eligibility is not determined based on race, sex, ethnicity, sexual orientation, or any other prohibited factor.