|Title:||An integrated summarization framework with hierarchical content representation|
|Authors:||Ouyang, You|
|Degree:||Ph.D.|
|Issue Date:||2011|
|Abstract:||With the rapid growth of Internet services, more and more electronic text is accessible online. While the abundance of information provides more resources for individuals, it also results in the well-recognized information overload problem, that is, the excessive amount of information being provided. The technology of automatic text summarization has emerged to deal with this problem. Automatic text summarization is the process of creating a shortened version of a text by computational techniques to help users grasp the important content of the original text(s) at an affordable time cost. Depending on how the summary is composed, summarization methods are either extractive or abstractive. Extractive methods are currently the mainstream and are the focus of this dissertation. The main question to be answered in extractive summarization is how to select a set of sentences from the input documents to form a summary that best conveys the important content of the input documents. To answer this question, we start by discovering important words in the input documents: we propose several content models for word saliency estimation and word-based sentence ranking, and then develop two word-based summarization methods with the content models. Experimental results on several authoritative data sets from the Document Understanding Conference (DUC) tasks demonstrate the effectiveness of the proposed methods. Our next target is to incorporate the relations between important words into the summarization process. We propose several methods to identify the latent word relations in the input documents and use them to obtain a hierarchical representation of the document content. Based on the hierarchical content representation, we propose a novel hierarchical summarization method that follows the general-to-specific style to extract summary sentences. Hierarchical summarization, which has not been studied systematically in previous research, is characterized by integrating various summarization objectives to simultaneously improve the content and readability of the composed summaries. The experimental results on the DUC data sets demonstrate the advantages of the proposed method over traditional summarization methods. Finally, we conduct several tentative studies to examine the use of more sophisticated content representations beyond single words for improving the hierarchical summarization method. The tentative studies capture several important details in developing good hierarchical summarization methods and shed light on the directions of future work in hierarchical summarization.|
|Subjects:||Automatic abstracting; Hong Kong Polytechnic University -- Dissertations|
|Pages:||xiii, 172 p. : ill. ; 30 cm.|
|Appears in Collections:||Thesis|
View full-text via https://theses.lib.polyu.edu.hk/handle/200/6279