Mon, Jun 14, 2021 @ 01:30 PM - 03:00 PM
PhD Candidate: Xusen Yin
Date: June 14, 2021
Generalized Sequential Decision-Making via Language
Committee: Jon May (Chair), Sven Koenig, Shri Narayanan
Many interactive applications, such as negotiation, game-playing, personal assistants, and online customer service, require sequential decision-making over natural language communication, a novel but fast-growing area of Natural Language Processing (NLP).
Unlike NLP tasks that deal with single sentences or documents (e.g., question answering, machine translation, and sentiment analysis), sequential decision-making requires language understanding and inference over sequences of sentences or documents. Moreover, unlike typical machine learning tasks, these applications usually have no direct training objective. Thus, they require a search in a decision-making space.
Deep Reinforcement Learning (RL) is a common choice for tasks that lack explicit or direct targets. It ordinarily needs many iterations to get close to an actual target because it must explore the enormous search space induced by the near-infinite range of natural language responses in dialogue. This exploration usually consists of many random moves, especially in the initial stage of RL training.
However, when placed in an unfamiliar environment, humans know how to solve new problems by applying existing knowledge and skills rather than acting randomly. In contrast, computer agents struggle in these new scenarios due to overfitting and a lack of common sense.
Can we generalize sequential decision-making agents to novel, even unrelated tasks through the medium of language? Using text-based games as a demo environment, we show how to train agents that draw on experience and external knowledge to choose beneficial decision sequences, and that generalize better than standard RL algorithms. We find that proper dialogue encoding helps intent understanding, that turning instance knowledge into universal knowledge helps in-domain generalization, and that large language models can provide external knowledge so that agents need not learn everything from scratch. Finally, we show that large language models fine-tuned for decision-making in one domain can guide RL algorithms towards better exploration and generalization in cross-domain transfer.
Audiences: Everyone Is Invited
Contact: Lizsl De Leon