Evaluating latent representations in encoder-decoder models for text summarization with multitask and transfer learning.


Neural encoder-decoder models have shown significant promise in sequence transduction tasks such as machine translation and, more recently, text summarization. Such models rely on a latent space representation of the source sequence, learned by the encoder, which is passed to a decoder to generate a target sequence. Understanding and evaluating what information these latent representations capture remains a significant challenge. Multitask learning aims to solve several tasks simultaneously using learned representations that are shared across tasks. Our research investigates the latent space representations learned by the encoder in the context of different sequence transduction tasks and multitask learning. We make use of a novel dataset of news articles from the Guardian newspaper, which are accompanied by metadata including short summaries and topic tag sequences. We train separate encoder-decoder recurrent neural network models to generate (a) abstractive text summaries of the articles and (b) topic tag sequences related to the article content. We first establish high-quality benchmarks on the new dataset and on the new task. We then use multitask learning to manipulate the latent representations learned by the models. We train an encoder-dual-decoder model to perform summarization and tag sequence generation simultaneously. Whilst the single-task models perform well, the multitask model fails to learn to generate high-quality sequences. We evaluate the learned representations using transfer learning with a semantic classification task. We show that the tag sequence generation model learns representations that are more useful for the semantic classification side task, and that by training a summarization model with a multitask objective we induce a similar performance increase on the side task.
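
As an illustration of what a multitask objective for an encoder-dual-decoder model could look like, the following minimal sketch combines a summarization loss and a tag sequence loss into a single training signal. It is not taken from this work: the weighted-sum form, the function names cross_entropy and multitask_loss, the tag_weight parameter, and the toy probabilities are all assumptions made for the example.

```python
import math


def cross_entropy(probs, target_ids):
    """Mean negative log-likelihood of the target tokens.

    probs: one probability distribution (list of floats) per decoding step.
    target_ids: index of the reference token at each step.
    """
    if len(probs) != len(target_ids):
        raise ValueError("one distribution is needed per target token")
    nll = [-math.log(step[t]) for step, t in zip(probs, target_ids)]
    return sum(nll) / len(nll)


def multitask_loss(summary_probs, summary_ids, tag_probs, tag_ids, tag_weight=0.5):
    """Weighted sum of the two decoder losses.

    tag_weight is a hypothetical knob (assumption): 0.0 trains on
    summarization only, 1.0 on tag sequence generation only.
    """
    summary_loss = cross_entropy(summary_probs, summary_ids)
    tag_loss = cross_entropy(tag_probs, tag_ids)
    return (1.0 - tag_weight) * summary_loss + tag_weight * tag_loss


# Toy decoder outputs over a 3-token vocabulary (made-up numbers).
summary_probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
tag_probs = [[0.5, 0.25, 0.25]]
loss = multitask_loss(summary_probs, [0, 1], tag_probs, [0])
```

In such a setup both decoders would read the same encoder output, so gradients from both loss terms would reach the shared encoder. That is one way a side task could shape the latent representation used for summarization.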