Model copying and rewriting in neural abstractive summarization

Cao, Ziqiang

Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/85965

Title:	Model copying and rewriting in neural abstractive summarization
Authors:	Cao, Ziqiang
Degree:	Ph.D.
Issue Date:	2018
Abstract:	Copying and rewriting are two core writing behaviors in human summarization. Traditional automatic summarization approaches basically follow these two styles. For example, extractive summarization copies source sentences, compressive summarization copies source words, and template-based summarization uses handcrafted rules to rewrite from pre-defined templates. Since 2016, sequence-to-sequence (seq2seq) neural networks have attracted increasing attention from abstractive summarization researchers. Compared with traditional summarization approaches, seq2seq models generate summaries end-to-end and require less human effort. However, most existing seq2seq approaches focus on learning to generate the summary text and overlook the two essential summarization skills mentioned above, i.e., copying and rewriting. These approaches suffer from two major problems. First, summarization has to start almost from scratch, discarding the prior knowledge accumulated over the past half century of research. The data scale thus becomes the most significant bottleneck for performance improvement. Second, the neural network architecture lacks explanation and is hard to evaluate. To address these problems, we explore explicitly modeling copying and rewriting in seq2seq summarization by using the prior knowledge learned from traditional summarization approaches. Our research consists of three parts. In the work presented in Chapter 3, we leverage the popular attention mechanism to copy and rewrite words in the source text. Our model fuses a copying decoder and a rewriting decoder. The copying decoder identifies the words to be copied from the source text based on learned attention. The rewriting decoder produces the other necessary summary words, restricted to a source-specific vocabulary that is also derived from the attention mechanism. Extensive experiments show that our model is able to generate informative summaries efficiently.
In Chapter 4, we investigate an important but neglected problem, i.e., the faithfulness problem in abstractive summarization. Abstractive summarization has to fuse different parts of the source text, which tends to create fake facts. We call this issue summary faithfulness. Our preliminary study reveals that nearly one third of the outputs from a state-of-the-art neural abstractive summarization system suffer from fake generation. To copy facts in the source text, we leverage open information extraction and dependency parsing techniques to extract true facts from the source text. Note that these techniques are also widely used in compressive summarization. We propose a dual-attention seq2seq summarization model that conditions summary generation on both the source text and the extracted facts. Experiments demonstrate that our model reduces fake summaries by 55% and at the same time achieves significant improvement in informativeness. Inspired by template-based summarization, we propose to use existing summaries as soft templates to guide the seq2seq model, as elaborated in Chapter 5. To this end, we use a popular information retrieval tool to retrieve appropriate existing summaries as candidate templates. We extend the seq2seq model by jointly learning template reranking and template-aware summary generation. Essentially, the model learns to rewrite the selected template (i.e., the summary pattern) according to the source text. Experiments show that our model significantly outperforms the state-of-the-art methods in terms of informativeness, and even the soft templates themselves are highly competitive. More importantly, the introduction of high-quality "external" summaries improves the stability and readability of output summaries and offers potential for generation diversity.
As one of the few large-scale studies of copying and rewriting in seq2seq models, our work is expected to advance more in-depth research on neural abstractive summarization driven by core writing behaviors.
Subjects:	Hong Kong Polytechnic University -- Dissertations; Automatic abstracting; Computational linguistics
Pages:	xiii, 148 pages : color illustrations
Appears in Collections:	Thesis

Access

View full-text via https://theses.lib.polyu.edu.hk/handle/200/9760
