Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/90394
Title: Ciron : a new benchmark dataset for Chinese irony detection
Authors: Xiang, R 
Gao, X 
Long, Y
Li, A 
Chersoni, E 
Lu, Q 
Huang, CR 
Issue Date: May-2020
Source: Proceedings of the 12th Conference on Language Resources and Evaluation, LREC, May 2020, p. 5714-5720
Abstract: Automatic Chinese irony detection is a challenging task, and it has a strong impact on linguistic research. However, Chinese irony detection often lacks labeled benchmark datasets. In this paper, we introduce Ciron, the first Chinese benchmark dataset available for irony detection for machine learning models. Ciron includes more than 8.7K posts, collected from Weibo, a micro blogging platform. Most importantly, Ciron is collected with no pre-conditions to ensure a much wider coverage. Evaluation on seven different machine learning classifiers proves the usefulness of Ciron as an important resource for Chinese irony detection.
Keywords: Irony detection
Chinese benchmark dataset
Social media text
Text processing
Rights: © European Language Resources Association (ELRA), licensed under CC-BY-NC https://creativecommons.org/licenses/by/4.0/
The following publication Xiang, R., Gao, X., Long, Y., Li, A., Chersoni, E., Lu, Q., & Huang, C. R. (2020, May). Ciron: a New Benchmark Dataset for Chinese Irony Detection. In Proceedings of The 12th Language Resources and Evaluation Conference (pp. 5714-5720) is available at https://www.aclweb.org/anthology/2020.lrec-1.701/
Appears in Collections: Conference Paper

Files in This Item:
File: 2020.lrec-1.701.pdf
Size: 241.05 kB
Format: Adobe PDF
Open Access Information
Status: open access
File Version: Version of Record
