Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/28986
Title: Building a Chinese collocation bank
Authors: Xu, R
Lu, Q 
Wong, KF
Li, W 
Keywords: Chinese collocation
Collocation bank
Collocation annotation
Collocation classification
Issue Date: 2009
Publisher: World Scientific Publishing Co
Source: International journal of computer processing of languages, 2009, v. 22, no. 1, p. 21-47
Journal: International journal of computer processing of languages 
Abstract: This paper presents the design and construction of an annotated Chinese collocation bank as a resource to support systematic research on Chinese collocations. The definition and properties of collocations are first studied. Based on a combination of these properties, a classification scheme is proposed that categorizes Chinese collocations into four types. With the help of computational tools, bigram and n-gram collocations of 3,643 headwords are manually identified in a 5-million-word corpus. For each identified bigram collocation, its dependency relation, chunking information and classification are then annotated to produce the collocation bank. The bank currently contains 23,581 bigram collocations and 2,752 n-gram collocations, making it a valuable resource for research on Chinese collocations. Statistical analysis of the bank reveals some interesting characteristics of Chinese bigram collocations, which are presented in this paper.
URI: http://hdl.handle.net/10397/28986
ISSN: 1793-8406
DOI: 10.1142/S1793840609002019
Appears in Collections: Journal/Magazine Article

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.