Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/95764
Title: Key phrase aware transformer for abstractive summarization
Authors: Liu, S 
Cao, J 
Yang, R 
Wen, Z 
Issue Date: May-2022
Source: Information processing and management, May 2022, v. 59, no. 3, 102913
Abstract: Abstractive summarization aims to generate a concise summary covering the salient content of single or multiple text documents. Many recent abstractive summarization methods are built on the transformer model to capture long-range dependencies in the input text and to achieve parallelization. In the transformer encoder, calculating attention weights is a crucial step for encoding input documents. Input documents usually contain key phrases that convey salient information, and it is important to encode these phrases completely. However, existing transformer-based summarization works do not consider key phrases in the input when determining attention weights. Consequently, some of the tokens within key phrases receive only small attention weights, which is not conducive to encoding the semantic information of input documents. In this paper, we introduce prior knowledge of key phrases into the transformer-based summarization model and guide the model to encode key phrases. For the contextual representation of each token in a key phrase, we assume that tokens within the same key phrase make larger contributions than the other tokens in the input sequence. Based on this assumption, we propose the Key Phrase Aware Transformer (KPAT), a model with a highlighting mechanism in the encoder that assigns greater attention weights to tokens within key phrases. Specifically, we first extract key phrases from the input document and score the phrases’ importance. Then we build a block-diagonal highlighting matrix to indicate these phrases’ importance scores and positions. To combine self-attention weights with the key phrases’ importance scores, we design two highlighting attention structures: highlighting attention for each head and multi-head highlighting attention. Experimental results on two datasets (Multi-News and PubMed) from different summarization tasks and domains show that our KPAT model significantly outperforms advanced summarization baselines. We conduct further experiments to analyze the impact of each part of our model on summarization performance and to verify the effectiveness of the proposed highlighting mechanism.
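The abstract does not give the exact formulation of the highlighting mechanism, but the minimal sketch below illustrates the general idea it describes: a block-diagonal highlighting matrix is built from key-phrase spans and their importance scores, then combined with single-head self-attention. The spans, scores, and the additive combination with the attention logits before the softmax are illustrative assumptions, not the paper's actual design.

```python
# Hypothetical sketch of a key-phrase highlighting mechanism (not the KPAT
# implementation). Assumes the block-diagonal highlighting matrix is added to
# the scaled dot-product attention logits before the softmax.
import numpy as np


def build_highlighting_matrix(seq_len, key_phrases):
    """Block-diagonal matrix H: H[i, j] = importance score if tokens i and j
    fall inside the same key phrase, 0 otherwise.

    key_phrases: list of (start, end, importance_score), end exclusive.
    """
    H = np.zeros((seq_len, seq_len))
    for start, end, score in key_phrases:
        H[start:end, start:end] = score
    return H


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def highlighting_attention(Q, K, V, H):
    """Single-head attention with key-phrase highlighting added to the logits."""
    d_k = Q.shape[-1]
    logits = Q @ K.T / np.sqrt(d_k)   # ordinary self-attention scores
    weights = softmax(logits + H)     # boost tokens sharing a key phrase
    return weights @ V, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    seq_len, d_model = 8, 16
    Q = rng.standard_normal((seq_len, d_model))
    K = rng.standard_normal((seq_len, d_model))
    V = rng.standard_normal((seq_len, d_model))
    # One made-up key phrase spanning tokens 2..4 with importance score 1.5.
    H = build_highlighting_matrix(seq_len, [(2, 5, 1.5)])
    out, attn = highlighting_attention(Q, K, V, H)
    print(attn.round(3))
```

Adding the highlighting scores before the softmax increases the attention weights among tokens that share a key phrase while leaving the relative scores of other positions unchanged; how KPAT actually combines importance scores with per-head and multi-head attention may differ from this sketch.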
Keywords: Abstractive summarization
Deep learning
Key phrase extraction
Text summarization
Publisher: Pergamon Press
Journal: Information processing and management 
ISSN: 0306-4573
DOI: 10.1016/j.ipm.2022.102913
Rights: © 2022 Elsevier Ltd. All rights reserved.
The following publication Liu, S., Cao, J., Yang, R., & Wen, Z. (2022). Key phrase aware transformer for abstractive summarization. Information Processing & Management, 59(3), 102913 is available at https://dx.doi.org/10.1016/j.ipm.2022.102913.
Appears in Collections:Journal/Magazine Article

Files in This Item:
File: Liu_Multi-Head_Abstractive_Summarization.pdf
Description: Preprint version
Size: 1.57 MB
Format: Adobe PDF
Open Access Information
Status: Open access
File Version: Author’s Original

Page views: 59 (last week: 1; as of May 19, 2024)
Downloads: 80 (as of May 19, 2024)
SCOPUS™ Citations: 18 (as of May 16, 2024)
Web of Science™ Citations: 12 (as of May 16, 2024)