Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/94845
Title: Robust stylometric analysis and author attribution based on tones and rimes
Authors: Hou, R 
Huang, CR 
Issue Date: Jan-2020
Source: Natural language engineering, Jan. 2020, v. 26, no. 1, p. 49-71
Abstract: In this article, we propose an innovative and robust approach to stylometric analysis that requires no annotation and leverages lexical and sub-lexical information. In particular, we propose to leverage the phonological information of tones and rimes in Mandarin Chinese, automatically extracted from unannotated texts. The texts from different authors were represented by tones, tone motifs, and word length motifs as well as rimes and rime motifs. Support vector machines and random forests were used to establish the text classification model for authorship attribution. From the results of the experiments, we conclude that the combination of bigrams of rimes, word-final rimes, and segment-final rimes can discriminate the texts from different authors effectively when using random forests to establish the classification model. This robust approach can in principle be applied to other languages with an established phonological inventory of onsets and rimes.
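To make the feature types named in the abstract more concrete, here is a minimal Python sketch of how rime bigrams and word-final rimes might be counted. It is not the authors' implementation: this record does not say how the texts were romanized or segmented, so the sketch assumes the input is already word-segmented and converted to tone-numbered pinyin. The names ONSETS, split_syllable, rime_bigrams, and word_final_rimes are hypothetical, and the onset list (which treats "y" and "w" as onsets) is a rough simplification of Mandarin phonology.

```python
from collections import Counter

# Pinyin initials (onsets); two-letter initials come first so "zh" is tried before "z".
# Treating "y" and "w" as onsets is a simplification made for this sketch.
ONSETS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
          "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w")


def split_syllable(syllable):
    """Split tone-numbered pinyin such as 'zhong1' into (onset, rime, tone)."""
    # Assumption: a syllable without a trailing digit carries the neutral tone, coded "5".
    tone = syllable[-1] if syllable[-1:].isdigit() else "5"
    base = syllable.rstrip("0123456789")
    for onset in ONSETS:
        if base.startswith(onset):
            return onset, base[len(onset):], tone
    return "", base, tone  # zero-onset syllable such as "an4"


def rime_bigrams(words):
    """Count bigrams of adjacent rimes across a text given as a list of words."""
    rimes = [split_syllable(s)[1] for word in words for s in word]
    return Counter(zip(rimes, rimes[1:]))


def word_final_rimes(words):
    """Count the rime of the last syllable of each word."""
    return Counter(split_syllable(word[-1])[1] for word in words if word)


# Hypothetical pre-segmented, pre-romanized input: two words of two syllables each.
text = [["zhong1", "guo2"], ["ren2", "min2"]]
bigram_counts = rime_bigrams(text)
final_counts = word_final_rimes(text)
```

Counts like these, computed per text and normalized, could serve as input vectors for a classifier such as a random forest. The feature selection and model settings used in the article itself are not given in this record.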
Keywords: Author identification
Quantitative stylistics
Random forest
Stylometrics
SVM
Tone and rime motifs
Publisher: Cambridge University Press
Journal: Natural language engineering 
ISSN: 1351-3249
EISSN: 1469-8110
DOI: 10.1017/S135132491900010X
Rights: This article has been published in a revised form in Natural Language Engineering https://doi.org/10.1017/S135132491900010X. This version is free to view and download for private research and study only. Not for re-distribution or re-use. © Cambridge University Press 2019.
Appears in Collections: Journal/Magazine Article

Files in This Item:
File: Hou_Stylometric_Tones_Rimes.pdf
Description: Pre-Published version
Size: 1.54 MB
Format: Adobe PDF
Open Access Information
Status: open access
File Version: Final Accepted Manuscript
