PhD Defense - Xing Shi
Tue, May 08, 2018 @ 10:00 AM - 12:00 PM
Thomas Lord Department of Computer Science
PhD Candidate: Xing Shi
Date: May 8, 10am at SAL 322
Committee: Kevin Knight (chair), Jonathan May, and Shri Narayanan
Abstract:
Recurrent neural networks (RNNs) have been successfully applied to various Natural Language Processing tasks, including language modeling, machine translation, and text generation. However, several obstacles still stand in the way: First, because of the RNN's distributed representations, its internal mechanism is difficult to interpret, and it remains a black box. Second, because of the large vocabularies involved, text generation is time-consuming. Third, there is no flexible way to constrain the output of a sequence model with external knowledge. Last, large amounts of training data must be collected to guarantee the performance of these neural models, whereas annotated data, such as the parallel data used in machine translation, are expensive to obtain. This work aims to address these four challenges.
To better understand the internal mechanism of the RNN, we choose neural machine translation (NMT) systems as a testbed. We first investigate how NMT outputs target strings of appropriate lengths, locating a collection of hidden units that learns to explicitly implement this functionality. We then investigate whether NMT systems learn source-language syntax as a by-product of training on string pairs. We find that both local and global syntactic information about source sentences is captured by the encoder, with different types of syntax stored in different layers to varying degrees of concentration.
To speed up text generation, we propose two novel GPU-based algorithms: 1) using source/target word alignment information to shrink the target-side run-time vocabulary, and 2) applying locality-sensitive hashing to find nearest word embeddings. Both methods lead to a 2-3x speedup on four translation tasks without hurting translation accuracy as measured by BLEU. Furthermore, we integrate a finite-state acceptor into the neural sequence model during generation, providing a flexible way to constrain the output; we successfully apply this to poem generation in order to control meter and rhyme.
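The locality-sensitive hashing idea can be illustrated with a minimal random-hyperplane sketch (pure Python, toy sizes and names of my own choosing, not the dissertation's actual GPU implementation): each embedding is hashed by the signs of a few random projections, and at decode time only the words in the query's bucket are scored instead of the full vocabulary.

```python
import random

random.seed(0)
VOCAB, DIM, BITS = 1000, 16, 12  # toy sizes; real systems are far larger

def rand_vec():
    return [random.gauss(0, 1) for _ in range(DIM)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Toy embedding table standing in for the target-side output embeddings.
embeddings = [rand_vec() for _ in range(VOCAB)]

# Random hyperplanes define the hash: each bit is the sign of a projection,
# so nearby vectors tend to share hash codes.
planes = [rand_vec() for _ in range(BITS)]

def lsh_hash(v):
    return tuple(1 if dot(p, v) > 0 else 0 for p in planes)

# Bucket every vocabulary embedding by its hash code.
buckets = {}
for i, e in enumerate(embeddings):
    buckets.setdefault(lsh_hash(e), []).append(i)

def approx_nearest(query, k=10):
    """Return up to k candidate word ids from the query's bucket,
    ranked by inner product (the softmax logit)."""
    cands = buckets.get(lsh_hash(query), [])
    return sorted(cands, key=lambda i: -dot(embeddings[i], query))[:k]

# A word's own embedding should retrieve that word among its candidates.
print(42 in approx_nearest(embeddings[42]))
```

Because only one small bucket is scored per step, the cost of the output layer no longer scales with the full vocabulary size, at the price of an approximate (rather than exact) top-k.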
To improve NMT performance on low-resource language pairs, we re-examine multiple techniques used in high-resource NMT and other NLP tasks, explore their variations, and combine them into a strong NMT system for low-resource languages. Experiments on Uyghur-English show a 10+ BLEU improvement over the vanilla NMT system.
Location: Henry Salvatori Computer Science Center (SAL) - 322
Audiences: Everyone Is Invited
Contact: Lizsl De Leon