A hybrid extraction model for Chinese noun/verb synonym bi-gram

Li, W; Lu, Q

Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/5202

DC Field	Value	Language
dc.contributor	Department of Computing	-
dc.creator	Li, W	-
dc.creator	Lu, Q	-
dc.date.accessioned	2014-12-11T08:26:11Z	-
dc.date.available	2014-12-11T08:26:11Z	-
dc.identifier.isbn	978-4-905166-02-3	-
dc.identifier.uri	http://hdl.handle.net/10397/5202	-
dc.language.iso	en	en_US
dc.publisher	Institute for Digital Enhancement of Cognitive Development, Waseda University	en_US
dc.rights	© 2011 The PACLIC 25 Organizing Committee and PACLIC Steering Committee	en_US
dc.rights	Copyright of contributed papers reserved by respective authors	en_US
dc.rights	Copyright 2011 by Wanyin Li, Qin Lu	en_US
dc.subject	Collocation extraction	en_US
dc.subject	Statistical model	en_US
dc.subject	Syntactic rules	en_US
dc.subject	Semantic relationship	en_US
dc.subject	Similarity calculation	en_US
dc.subject	HowNet	en_US
dc.title	A hybrid extraction model for Chinese noun/verb synonym bi-gram	en_US
dc.type	Conference Paper	en_US
dcterms.abstract	Statistical collocation extraction approaches suffer from (1) low precision, because bi-grams with high co-occurrence may be syntactically unrelated and are thus not true collocations; and (2) low recall, because some true collocations with low occurrence counts cannot be identified by statistical models. Integrating syntactic rules and semantic knowledge into a statistical model for collocation extraction is one way to achieve high precision while keeping reasonable recall. This paper designs a cascade system that employs a hybrid model, integrating both syntactic and semantic knowledge into a statistical model, for extracting Chinese synonymous noun/verb collocations. Grammatically bounded noun/verb collocations are first extracted by a syntactic-rule-based module; its output is then passed to a semantic-based module for further retrieval of low-frequency bi-gram collocations.	-
dcterms.accessRights	open access	en_US
dcterms.bibliographicCitation	Proceedings of the 25th Pacific Asia Conference on Language, Information and Computation (PACLIC 25), 16-18 Dec, Nanyang Technological University, Singapore, p. 430-439	-
dcterms.issued	2011-12-16	-
dc.identifier.scopus	2-s2.0-84863869937	-
dc.identifier.rosgroupid	r60560	-
dc.description.ros	2011-2012 > Academic research: refereed > Refereed conference paper	-
dc.description.oa	Version of Record	en_US
dc.identifier.FolderNumber	OA_IR/PIRA	en_US
dc.description.pubStatus	Published	en_US
dc.description.oaCategory	Copyright retained by author	en_US
Appears in Collections:	Conference Paper

Files in This Item:

File	Description	Size	Format
Li_Hybrid_Extraction_Bi-gram.pdf		135.34 kB	Adobe PDF

Open Access Information

Status	open access
File Version	Version of Record
