Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/28557
Title: One story, one flow : Hidden Markov Story Models for multilingual multidocument summarization
Authors: Fung, P
Ngai, G 
Keywords: Multilingual document summarization
Hidden Markov models
Issue Date: 2006
Source: ACM Transactions on speech and language processing, 2006, v. 3, no. 2, p. 1-16
Journal: ACM Transactions on speech and language processing 
Abstract: This article presents a multidocument, multilingual, theme-based summarization system based on modeling text cohesion (story flow). Conventional extractive summarization systems that pick out salient sentences to include in a summary often disregard any flow or sequence that might exist between these sentences. We argue that such inherent text cohesion exists and is (1) specific to a particular story and (2) specific to a particular language. Documents within the same story, and in the same language, share a common story flow, and this flow differs across stories and across languages. We propose using Hidden Markov Models (HMMs) as story models. An unsupervised segmental K-means method is used to iteratively cluster multiple documents into different topics (stories) and learn the parameters of parallel Hidden Markov Story Models (HMSM), one for each story. We compare story models within and across stories and within and across languages (English and Chinese). The experimental results support our “one story, one flow” and “one language, one flow” hypotheses. We also propose a Naïve Bayes classifier for document summarization. The performance of our summarizer is superior to conventional methods that do not incorporate text cohesion information. Our HMSM method also provides a simple way to compile a single metasummary for multiple documents from individual summaries via state-labeled sentences.
URI: http://hdl.handle.net/10397/28557
Appears in Collections:Journal/Magazine Article

