Please use this identifier to cite or link to this item:
http://hdl.handle.net/10397/94845
| DC Field | Value | Language |
|---|---|---|
| dc.contributor | Department of Chinese and Bilingual Studies | en_US |
| dc.creator | Hou, R | en_US |
| dc.creator | Huang, CR | en_US |
| dc.date.accessioned | 2022-08-30T07:33:10Z | - |
| dc.date.available | 2022-08-30T07:33:10Z | - |
| dc.identifier.issn | 1351-3249 | en_US |
| dc.identifier.uri | http://hdl.handle.net/10397/94845 | - |
| dc.language.iso | en | en_US |
| dc.publisher | Cambridge University Press | en_US |
| dc.rights | This article has been published in a revised form in Natural Language Engineering https://doi.org/10.1017/S135132491900010X. This version is free to view and download for private research and study only. Not for re-distribution or re-use. © Cambridge University Press 2019. | en_US |
| dc.subject | Author identification | en_US |
| dc.subject | Quantitative stylistics | en_US |
| dc.subject | Random forest | en_US |
| dc.subject | Stylometrics | en_US |
| dc.subject | SVM | en_US |
| dc.subject | Tone and rime motifs | en_US |
| dc.title | Robust stylometric analysis and author attribution based on tones and rimes | en_US |
| dc.type | Journal/Magazine Article | en_US |
| dc.identifier.spage | 49 | en_US |
| dc.identifier.epage | 71 | en_US |
| dc.identifier.volume | 26 | en_US |
| dc.identifier.issue | 1 | en_US |
| dc.identifier.doi | 10.1017/S135132491900010X | en_US |
| dcterms.abstract | In this article, we propose an innovative and robust approach to stylometric analysis without annotation and leveraging lexical and sub-lexical information. In particular, we propose to leverage the phonological information of tones and rimes in Mandarin Chinese automatically extracted from unannotated texts. The texts from different authors were represented by tones, tone motifs, and word length motifs as well as rimes and rime motifs. Support vector machines and random forests were used to establish the text classification model for authorship attribution. From the results of the experiments, we conclude that the combination of bigrams of rimes, word-final rimes, and segment-final rimes can discriminate the texts from different authors effectively when using random forests to establish the classification model. This robust approach can in principle be applied to other languages with established phonological inventory of onset and rimes. | en_US |
| dcterms.accessRights | open access | en_US |
| dcterms.bibliographicCitation | Natural language engineering, Jan. 2020, v. 26, no. 1, p. 49-71 | en_US |
| dcterms.isPartOf | Natural language engineering | en_US |
| dcterms.issued | 2020-01 | - |
| dc.identifier.scopus | 2-s2.0-85064947362 | - |
| dc.identifier.eissn | 1469-8110 | en_US |
| dc.description.validate | 202208 bckw | en_US |
| dc.description.oa | Accepted Manuscript | en_US |
| dc.identifier.FolderNumber | a1330, CBS-0152 | en_US |
| dc.identifier.SubFormID | 44610 | - |
| dc.description.fundingSource | Others | en_US |
| dc.description.fundingText | National Social Science Fund in China (Grant Number: 16BYY110), The Hong Kong Polytechnic University Grant 4-ZZFE | en_US |
| dc.description.pubStatus | Published | en_US |
| dc.identifier.OPUS | 14448224 | en_US |
| dc.description.oaCategory | Green (AAM) | en_US |
| Appears in Collections: | Journal/Magazine Article | |
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| Hou_Stylometric_Tones_Rimes.pdf | Pre-Published version | 1.54 MB | Adobe PDF | View/Open |
Usage statistics:
| Metric | Count | As of |
|---|---|---|
| Page views | 40 | Apr 14, 2025 |
| Downloads | 198 | Apr 14, 2025 |
| Scopus citations | 17 | Dec 19, 2025 |
| Web of Science citations | 15 | Oct 10, 2024 |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.



