Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/9966
Title: Similarity based Chinese synonym collocation extraction
Authors: Li, W
Lu, Q 
Xu, R
Keywords: Lexical Statistics
Synonymous Collocations
Similarity
Semantic Information
Issue Date: 2005
Publisher: Citeseer
Source: Computational linguistics and Chinese language processing, 2005, v. 10, no. 1, p. 123-144
Journal: Computational linguistics and Chinese language processing 
Abstract: Collocation extraction systems based on pure statistical methods suffer from two major problems. The first problem is their relatively low precision and recall rates. The second problem is their difficulty in dealing with sparse collocations. In order to improve performance, both statistical and lexicographic approaches should be considered. This paper presents a new method to extract synonymous collocations using semantic information. The semantic information is obtained by calculating similarities from HowNet. We have successfully extracted synonymous collocations which normally cannot be extracted using lexical statistics. Our evaluation conducted on a 60MB tagged corpus shows that we can extract synonymous collocations that occur with very low frequency and that the improvement in the recall rate is close to 100%. In addition, compared with a collocation extraction system based on the Xtract system for English, our algorithm can improve the precision rate by about 44%.
URI: http://hdl.handle.net/10397/9966
Appears in Collections: Journal/Magazine Article

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.