Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/94845
DC Field | Value | Language
dc.contributor | Department of Chinese and Bilingual Studies | en_US
dc.creator | Hou, R | en_US
dc.creator | Huang, CR | en_US
dc.date.accessioned | 2022-08-30T07:33:10Z | -
dc.date.available | 2022-08-30T07:33:10Z | -
dc.identifier.issn | 1351-3249 | en_US
dc.identifier.uri | http://hdl.handle.net/10397/94845 | -
dc.language.iso | en | en_US
dc.publisher | Cambridge University Press | en_US
dc.rights | This article has been published in a revised form in Natural Language Engineering https://doi.org/10.1017/S135132491900010X. This version is free to view and download for private research and study only. Not for re-distribution or re-use. © Cambridge University Press 2019. | en_US
dc.subject | Author identification | en_US
dc.subject | Quantitative stylistics | en_US
dc.subject | Random forest | en_US
dc.subject | Stylometrics | en_US
dc.subject | SVM | en_US
dc.subject | Tone and rime motifs | en_US
dc.title | Robust stylometric analysis and author attribution based on tones and rimes | en_US
dc.type | Journal/Magazine Article | en_US
dc.identifier.spage | 49 | en_US
dc.identifier.epage | 71 | en_US
dc.identifier.volume | 26 | en_US
dc.identifier.issue | 1 | en_US
dc.identifier.doi | 10.1017/S135132491900010X | en_US
dcterms.abstract | In this article, we propose an innovative and robust approach to stylometric analysis without annotation and leveraging lexical and sub-lexical information. In particular, we propose to leverage the phonological information of tones and rimes in Mandarin Chinese automatically extracted from unannotated texts. The texts from different authors were represented by tones, tone motifs, and word length motifs as well as rimes and rime motifs. Support vector machines and random forests were used to establish the text classification model for authorship attribution. From the results of the experiments, we conclude that the combination of bigrams of rimes, word-final rimes, and segment-final rimes can discriminate the texts from different authors effectively when using random forests to establish the classification model. This robust approach can in principle be applied to other languages with established phonological inventory of onset and rimes. | en_US
dcterms.accessRights | open access | en_US
dcterms.bibliographicCitation | Natural language engineering, Jan. 2020, v. 26, no. 1, p. 49-71 | en_US
dcterms.isPartOf | Natural language engineering | en_US
dcterms.issued | 2020-01 | -
dc.identifier.scopus | 2-s2.0-85064947362 | -
dc.identifier.eissn | 1469-8110 | en_US
dc.description.validate | 202208 bckw | en_US
dc.description.oa | Accepted Manuscript | en_US
dc.identifier.FolderNumber | a1330, CBS-0152 | en_US
dc.identifier.SubFormID | 44610 | -
dc.description.fundingSource | Others | en_US
dc.description.fundingText | National Social Science Fund in China (Grant Number: 16BYY110), The Hong Kong Polytechnic University Grant 4-ZZFE | en_US
dc.description.pubStatus | Published | en_US
dc.identifier.OPUS | 14448224 | en_US
dc.description.oaCategory | Green (AAM) | en_US
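The abstract names three rime-based feature types (bigrams of rimes, word-final rimes, and segment-final rimes). The sketch below illustrates how such counts might be computed; it is not the authors' implementation. It assumes the text has already been segmented and transcribed into tone-numbered pinyin, treats `y` and `w` as onsets for simplicity, and counts bigrams within a segment only. The names `ONSETS`, `split_syllable`, and `rime_features` are hypothetical.

```python
from collections import Counter

# Pinyin onsets; two-letter onsets come first so "zh" matches before "z".
# Treating "y" and "w" as onsets is a simplifying assumption.
ONSETS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
          "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w"]


def split_syllable(syl):
    """Split a tone-numbered syllable such as 'zhong1' into (onset, rime, tone)."""
    tone = syl[-1] if syl[-1].isdigit() else "5"  # unmarked: treat as neutral tone
    body = syl.rstrip("0123456789")
    for onset in ONSETS:
        if body.startswith(onset):
            return onset, body[len(onset):], tone
    return "", body, tone  # zero-onset syllable, e.g. 'an1'


def rime_features(segments):
    """Count rime bigrams, word-final rimes, and segment-final rimes.

    `segments` is a list of punctuation-delimited segments; each segment is a
    list of words; each word is a non-empty list of pinyin syllables.
    """
    feats = Counter()
    for segment in segments:
        rimes = [split_syllable(s)[1] for word in segment for s in word]
        # Bigrams of adjacent rimes, counted within the segment (an assumption).
        for a, b in zip(rimes, rimes[1:]):
            feats["bigram:" + a + "_" + b] += 1
        # Rime of the last syllable of each word.
        for word in segment:
            feats["word_final:" + split_syllable(word[-1])[1]] += 1
        # Rime of the last syllable of the segment.
        if rimes:
            feats["segment_final:" + rimes[-1]] += 1
    return feats


# Toy input: two segments, "wo3 xi3huan1" and "ni3 hao3".
example = [[["wo3"], ["xi3", "huan1"]], [["ni3"], ["hao3"]]]
features = rime_features(example)
```

Count vectors of this kind, one per text, could then be passed to a classifier such as a random forest or a support vector machine, the two model families the abstract mentions.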
Appears in Collections: Journal/Magazine Article
Files in This Item:
File | Description | Size | Format
Hou_Stylometric_Tones_Rimes.pdf | Pre-Published version | 1.54 MB | Adobe PDF
Open Access Information
Status: open access
File Version: Final Accepted Manuscript

Page views: 40; Last Week: 0; Last month: (no value shown); as of Apr 14, 2025
Downloads: 198, as of Apr 14, 2025
Scopus citations: 17, as of Dec 19, 2025
Web of Science citations: 15, as of Oct 10, 2024

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.