Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/92348
Title: Character profiling in low-resource language documents
Authors: Wong, TS 
Lee, J
Issue Date: 2019
Source: In G Demartini & P Thomas (Eds.), ADCS 2019 : proceedings of the 24th Australasian Document Computing Symposium : Sydney, Australia, December 5-6, 2019. New York, NY, United States : Association for Computing Machinery, 2019.
Abstract: This paper focuses on automatic character profiling — connecting “who”, “what” and “when” — in literary documents. This task is especially challenging for low-resource languages, since off-the-shelf tools for named entity recognition, syntactic parsing and other natural language processing tasks are rarely available. We investigate the impact of human annotation on automatic profiling. Based on a Medieval Chinese corpus, experimental results show that even a relatively small amount of word segmentation, part-of-speech and dependency annotation can improve accuracy in named entity recognition and in identifying character-verb associations, but not character-toponym associations.
Keywords: Dependency parsing
Information extraction
Low-resource language
Medieval Chinese
Named entity recognition
Publisher: Association for Computing Machinery
ISBN: 978-1-4503-7766-9
DOI: 10.1145/3372124.3372129
Rights: © 2019 Association for Computing Machinery.
This is the accepted version of the publication Tak-sum Wong and John Lee. 2019. Character Profiling in Low-Resource Language Documents. In Proceedings of the 24th Australasian Document Computing Symposium (ADCS '19). Association for Computing Machinery, New York, NY, USA, Article 5, 1-4. The final published version of record is available at https://dx.doi.org/10.1145/3372124.3372129
Appears in Collections: Conference Paper

Files in This Item:
File: ADCS2019_Buddhist_cameraready.pdf
Description: Pre-Published version
Size: 394 kB
Format: Adobe PDF
Open Access Information
Status: open access
File Version: Final Accepted Manuscript
