Conferences, Lectures, & Seminars
Events for August
NL Seminar-Improving machine translation from low resource languages
Fri, Aug 11, 2017 @ 03:00 PM - 04:00 PM
Information Sciences Institute
Speaker: Nima Pourdamghani, USC/ISI
Talk Title: Improving machine translation from low resource languages
Series: Natural Language Seminar
Abstract: Statistical machine translation (MT) often needs a large corpus of parallel translated sentences to achieve good performance. This limits the use of current MT technologies to a few resource-rich languages. Assume an incident happens in an area with a low-resource language. For a quick response, we need to build an MT system with the available data, as finding or translating new parallel data is expensive and time-consuming. For many languages this means that we have only a small amount of parallel data, often out-of-domain, e.g., a Bible or an Ubuntu manual. This talk is about ways to improve machine translation in low-resource scenarios. I'll talk about the use of monolingual data and parallel data from related languages to improve machine translation from the low-resource language into English.
Biography: Nima Pourdamghani is a fourth-year Ph.D. student at ISI. He works with Professor Kevin Knight on machine translation from low-resource languages.
Host: Nima Pourdamghani
Location: Information Sciences Institute (ISI) - 6th Flr Conf Rm # 689
Audiences: Everyone Is Invited
Contact: Peter Zamar
This event is open to all eligible individuals. USC Viterbi operates all of its activities consistent with the University's Notice of Non-Discrimination. Eligibility is not determined based on race, sex, ethnicity, sexual orientation, or any other prohibited factor.
NL Seminar-Neural Creative Language Generation
Fri, Aug 18, 2017 @ 03:00 PM - 04:00 PM
Information Sciences Institute
Speaker: Marjan Ghazvininejad, USC/ISI
Talk Title: Neural Creative Language Generation
Series: Natural Language Seminar
Abstract: Natural language generation (NLG) is a well-studied and still very challenging field in natural language processing. One of the less-studied NLG tasks is the generation of creative texts such as jokes, puns, or poems. Multiple reasons contribute to the difficulty of research in this area. First, no immediate application exists for creative language generation. This has made the research on creative NLG extremely diverse, with different goals, assumptions, and constraints. Second, no quantitative measure exists for creative NLG tasks. Consequently, it is often difficult to tune the parameters of creative generation models and drive improvements to these systems. Finally, rule-based systems for creative language generation are not yet combined with deep learning methods.
In this work, we address these challenges for poetry generation, which is one of the main areas of creative language generation. We introduce password poems as a novel application for poetry generation. Furthermore, we combine finite-state machinery with deep learning models in a system for generating poems on any given topic. We introduce a quantitative metric for evaluating the generated poems and build the first interactive poetry generation system, which enables users to revise system-generated poems by adjusting style configuration settings such as alliteration, concreteness, and the sentiment of the poem.
To improve the poetry generation system, we borrow ideas from human literature and develop a poetry translation system. We propose to study human poetry translation and measure the language variation in this process. We will study how human poetry translation differs from human translation in general and whether a translator translates poetry more freely. Then we will use our findings to develop a machine translation system specifically for translating poetry and to propose metrics for evaluating the quality of poetry translation.
Biography: Marjan Ghazvininejad is a PhD student at ISI working with Professor Kevin Knight.
Host: Kevin Knight
More Info: http://nlg.isi.edu/nl-seminar/
Location: Information Sciences Institute (ISI) - 11th Flr Conf Rm # 1135, Marina Del Rey
Audiences: Everyone Is Invited
Contact: Peter Zamar
Event Link: http://nlg.isi.edu/nl-seminar/
THURSDAY TALKS: NL Seminars - (1) Recurrent Neural Networks as Weighted Language Recognizers; (2) Gloss-to-English: Improving Low Resource Language Translation Using Alignment Tables
Thu, Aug 31, 2017 @ 03:00 PM - 04:00 PM
Information Sciences Institute
Speaker: Yining Chen and Sasha Mayn, USC/ISI Interns
Talk Title: THURSDAY TALKS: (1) Recurrent Neural Networks as Weighted Language Recognizers; (2) Gloss-to-English: Improving Low Resource Language Translation Using Alignment Tables
Series: Natural Language Seminar
Abstract: 1. We investigate properties of a simple recurrent neural network (RNN) as a formal device for recognizing weighted languages. We focus on the single-layer, ReLU-activation, rational-weight RNN with softmax, a standard form of RNN used in language processing applications. We prove that many questions one may ask about such RNNs are undecidable, including consistency, equivalence, minimization, and finding the highest-weighted string. For consistent RNNs, finding the highest-weighted string is decidable, although the solution can be exponentially long in the length of the input RNN encoded in binary. Limiting to solutions of polynomial length, we prove that finding the highest-weighted string for a consistent RNN is NP-complete and APX-hard.
2. Neural machine translation has gained popularity in recent years and has achieved impressive results. The caveat is that millions of parallel sentences are needed to train the system properly, and in a low-resource scenario that amount of data simply may not be available. This talk will discuss strategies for addressing the data scarcity problem, particularly using alignment tables to make use of parallel data from higher-resource language pairs and creating synthetic in-domain data.
Biography: Yining Chen is a third-year undergraduate student at Dartmouth College. She is a summer intern at ISI working with Professor Kevin Knight and Professor Jonathan May.
Sasha Mayn is a summer intern for the ISI Natural Language Group. She is particularly interested in machine translation and language generation. Last summer Sasha interned at the PanLex Project in Berkeley, where she was responsible for preprocessing digital dictionaries and entering them into PanLex's multilingual database. This summer she has been working on improving neural machine translation strategies for low-resource languages under the supervision of Jon May and Kevin Knight.
Host: Marjan Ghazvininejad and Kevin Knight
More Info: http://nlg.isi.edu/nl-seminar/
Location: Information Sciences Institute (ISI) - 11th Flr Conf Rm # 1135, Marina Del Rey
Audiences: Everyone Is Invited
Contact: Peter Zamar
Event Link: http://nlg.isi.edu/nl-seminar/