University of Southern California

Events Calendar

  • NL Seminar: Decipherment for Universal Language Tools: A Case Study for Unsupervised Part-of-Speech Induction

    Fri, Aug 17, 2018 @ 03:00 PM - 04:00 PM

    Information Sciences Institute

    Conferences, Lectures, & Seminars

    Speaker: Ronald Cardenas, USC

    Talk Title: Decipherment for Universal Language Tools: A Case Study for Unsupervised Part-of-Speech Induction

    Series: Natural Language Seminar

    Abstract: Unsupervised Part-of-Speech (POS) induction can be viewed as a two-step task. The first step infers a sequence of states, while the second step maps this sequence to an actual POS sequence at training or testing time. The second step therefore requires reference tagged data, a luxury that low-resource target languages might not have. In this talk, we present an alternative approach to the second step, modeling it as a decipherment problem in which the ciphered text is the sequence of states and the original text we want to recover is the POS sequence. This approach requires no reference data in the target language and makes it possible to leverage POS sequences from languages with much richer resources. Our experiments show that our approach benefits the most from simple strategies for inferring state sequences, such as Brown clustering. This allows our method to obtain reasonable performance in low-resource and limited-time scenarios.

    Biography: Ronald Cardenas is a Master's student in the Language and Communication Technologies programme at Charles University in Prague. His research interests span morphological analysis and parsing of low-resource languages. At ISI, he works with Jonathan May on developing universal language tools.

    Host: Nanyun Peng

    More Info: http://nlg.isi.edu/nl-seminar/

    Location: Information Sciences Institute (ISI) - 11th Flr Conf Rm # 1135, Marina Del Rey

    Audiences: Everyone Is Invited

    Contact: Peter Zamar

    Event Link: http://nlg.isi.edu/nl-seminar/

