Thu, Sep 05, 2019 @ 11:00 AM - 12:00 PM
Information Sciences Institute
Conferences, Lectures, & Seminars
Speaker: Denis Emelin and Prince Wang, USC/ISI
Talk Title: More than the sum of their parts: Translating idioms without destroying their meaning
Series: Natural Language Seminar
Abstract: Translating idioms is hard. As low-frequency linguistic events with non-compositional meaning, idiomatic expressions are at odds with contemporary neural machine translation (NMT) methods. Accordingly, the literal translation of idiomatic phrases, which fails to preserve their semantic content, is a frequently observed failure case in NMT models. To facilitate future work on idiom translation, the current project sets out to compile a large-coverage, multilingual corpus of parallel sentences containing idiomatic expressions, augmented with their respective monolingual definitions. With this resource in hand, we next aim to propose models that can effectively exploit idiom definitions to avoid literal translation errors. As part of the evaluation of the constructed corpus, we demonstrate that idioms continue to pose a substantial challenge for state-of-the-art NMT models.
Biography: Denis Emelin is a second-year PhD candidate at the University of Edinburgh, advised by Dr. Rico Sennrich. His background is in machine translation, natural language understanding, and linguistics.
Host: Emily Sheng
More Info: https://nlg.isi.edu/nl-seminar
WebCast Link: https://bluejeans.com/s/8Lu7w/
Audiences: Everyone Is Invited
Contact: Peter Zamar