Ciron : a new benchmark dataset for Chinese irony detection

Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/90394

Title:	Ciron : a new benchmark dataset for Chinese irony detection
Authors:	Xiang, R; Gao, X; Long, Y; Li, A; Chersoni, E; Lu, Q; Huang, CR
Issue Date:	May-2020
Source:	Proceedings of the 12th Conference on Language Resources and Evaluation, LREC, May 2020, p. 5714-5720
Abstract:	Automatic Chinese irony detection is a challenging task, and it has a strong impact on linguistic research. However, Chinese irony detection often lacks labeled benchmark datasets. In this paper, we introduce Ciron, the first Chinese benchmark dataset available for irony detection for machine learning models. Ciron includes more than 8.7K posts, collected from Weibo, a micro blogging platform. Most importantly, Ciron is collected with no pre-conditions to ensure a much wider coverage. Evaluation on seven different machine learning classifiers proves the usefulness of Ciron as an important resource for Chinese irony detection.
Keywords:	Irony detection; Chinese benchmark dataset; Social media text; Text processing
Rights:	© European Language Resources Association (ELRA), licensed under CC-BY-NC (https://creativecommons.org/licenses/by/4.0/). The following publication: Xiang, R., Gao, X., Long, Y., Li, A., Chersoni, E., Lu, Q., & Huang, C. R. (2020, May). Ciron: a New Benchmark Dataset for Chinese Irony Detection. In Proceedings of The 12th Language Resources and Evaluation Conference (pp. 5714-5720), is available at https://www.aclweb.org/anthology/2020.lrec-1.701/
Appears in Collections:	Conference Paper

File	Description	Size	Format
2020.lrec-1.701.pdf		241.05 kB	Adobe PDF

Status	open access
File Version	Version of Record
