Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/67186
Title: Named entity recognition for Chinese novels in the Ming-Qing dynasties
Authors: Long, Y
Xiong, D
Keywords: Chinese vernacular novels
Dependency parsing
Named entity recognition
Wikipedia
Issue Date: 2016
Publisher: Springer
Source: Lecture notes in computer science (including subseries Lecture notes in artificial intelligence and lecture notes in bioinformatics), 2016, v. 10085, p. 362-375
Journal: Lecture notes in computer science (including subseries Lecture notes in artificial intelligence and lecture notes in bioinformatics) 
Abstract: This paper presents a Named Entity Recognition (NER) system for Chinese classic novels in the Ming and Qing dynasties using the Conditional Random Fields (CRFs) method. An annotated corpus of four influential vernacular novels produced during this period is used as both training and testing data. In the experiment, three novels are used as training data and one novel is used as the testing data. Three sets of features are proposed for the CRFs model: (1) baseline feature set, that is, word/POS and bigram for different window sizes, (2) dependency head and dependency relationship, and (3) Wikipedia categories. The F-measures for these four books range from 67% to 80%. Experiments show that using the dependency head and relationship as well as Wikipedia categories can improve the performance of the NER system. Compared with the second feature set, the third one can produce greater improvement.
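As a rough illustration of the three feature sets named in the abstract, the sketch below builds a per-token feature dictionary of the kind a CRF toolkit could consume. It is not the authors' implementation: the `Token` fields, the feature names, the padding symbol, the POS and dependency labels, and the example sentence and its Wikipedia category are all assumptions made for this example.

```python
from typing import Dict, List, NamedTuple, Tuple


class Token(NamedTuple):
    """One segmented word with pre-computed annotations (hypothetical layout)."""
    word: str
    pos: str                     # part-of-speech tag
    head: str                    # surface form of the dependency head
    deprel: str                  # dependency relation to the head
    wiki_cats: Tuple[str, ...]   # Wikipedia categories matched for the word


PAD = "<PAD>"  # assumed placeholder for positions outside the sentence


def token_features(sent: List[Token], i: int, window: int = 1) -> Dict[str, str]:
    """Build features for the token at position i."""
    feats: Dict[str, str] = {}
    n = len(sent)

    # (1) Baseline: word and POS inside the window, plus word bigrams.
    for off in range(-window, window + 1):
        j = i + off
        inside = 0 <= j < n
        feats[f"w[{off}]"] = sent[j].word if inside else PAD
        feats[f"pos[{off}]"] = sent[j].pos if inside else PAD
    for off in range(-window, window):
        left, right = feats[f"w[{off}]"], feats[f"w[{off + 1}]"]
        feats[f"w[{off}]|w[{off + 1}]"] = f"{left}|{right}"

    # (2) Dependency head and dependency relation.
    feats["head"] = sent[i].head
    feats["deprel"] = sent[i].deprel

    # (3) Wikipedia categories, one indicator feature per category.
    for cat in sent[i].wiki_cats:
        feats[f"wiki={cat}"] = "1"

    return feats


# Illustrative sentence; tags, relations, and the category are made up.
SENT = [
    Token("宝玉", "NR", "笑", "nsubj", ("fictional character",)),
    Token("笑", "VV", "ROOT", "root", ()),
    Token("道", "VV", "笑", "conj", ()),
]

if __name__ == "__main__":
    for key, value in sorted(token_features(SENT, 0).items()):
        print(key, "=", value)
```

Widening `window` adds more context words and bigrams, which corresponds to the "different window sizes" mentioned for the baseline feature set.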
Description: 17th Chinese Lexical Semantics Workshop, CLSW 2016, Singapore, 20-22 May 2016
URI: http://hdl.handle.net/10397/67186
ISBN: 9783319495071
ISSN: 0302-9743
EISSN: 1611-3349
DOI: 10.1007/978-3-319-49508-8_34
Appears in Collections: Conference Paper

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.