Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/108877
DC Field | Value | Language
dc.contributor | Department of Chinese and Bilingual Studies | en_US
dc.contributor | Department of Computing | en_US
dc.contributor | Department of Applied Mathematics | en_US
dc.creator | Xiang, R | en_US
dc.creator | Chersoni, E | en_US
dc.creator | Li, Y | en_US
dc.creator | Li, J | en_US
dc.creator | Huang, CR | en_US
dc.creator | Pan, Y | en_US
dc.creator | Li, Y | en_US
dc.date.accessioned | 2024-09-04T07:42:12Z | -
dc.date.available | 2024-09-04T07:42:12Z | -
dc.identifier.issn | 1574-020X | en_US
dc.identifier.uri | http://hdl.handle.net/10397/108877 | -
dc.language.iso | en | en_US
dc.publisher | Springer Dordrecht | en_US
dc.rights | © The Author(s) 2024 | en_US
dc.rights | This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. | en_US
dc.rights | The following publication Xiang, R., Chersoni, E., Li, Y. et al. Cantonese natural language processing in the transformers era: a survey and current challenges. Lang Resources & Evaluation 59, 1747–1773 (2025) is available at https://doi.org/10.1007/s10579-024-09744-w. | en_US
dc.subject | Cantonese | en_US
dc.subject | Code-switching | en_US
dc.subject | Evaluation resources | en_US
dc.subject | Multilingualism | en_US
dc.subject | NLP for social media | en_US
dc.title | Cantonese natural language processing in the transformers era : a survey and current challenges | en_US
dc.type | Journal/Magazine Article | en_US
dc.identifier.spage | 1747 | en_US
dc.identifier.epage | 1773 | en_US
dc.identifier.volume | 59 | en_US
dc.identifier.issue | 2 | en_US
dc.identifier.doi | 10.1007/s10579-024-09744-w | en_US
dcterms.abstract | Despite being spoken by a large population of speakers worldwide, Cantonese is under-resourced in terms of the data scale and diversity compared to other major languages. This limitation has excluded it from the current “pre-training and fine-tuning” paradigm that is dominated by Transformer architectures. In this paper, we provide a comprehensive review on the existing resources and methodologies for Cantonese Natural Language Processing, covering the recent progress in language understanding, text generation and development of language models. We finally discuss two aspects of the Cantonese language that could make it potentially challenging even for state-of-the-art architectures: colloquialism and multilinguality | en_US
dcterms.accessRights | open access | en_US
dcterms.bibliographicCitation | Language resources and evaluation, June 2025, v. 59, no. 2, p. 1747-1773 | en_US
dcterms.isPartOf | Language resources and evaluation | en_US
dcterms.issued | 2025-06 | -
dc.identifier.scopus | 2-s2.0-85195564852 | -
dc.identifier.eissn | 1574-0218 | en_US
dc.description.validate | 202409 bcch | en_US
dc.description.oa | Version of Record | en_US
dc.identifier.FolderNumber | OA_TA | -
dc.description.fundingSource | Self-funded | en_US
dc.description.pubStatus | Published | en_US
dc.description.TA | Springer Nature (2024) | en_US
dc.description.oaCategory | TA | en_US
Appears in Collections: Journal/Magazine Article
Files in This Item:
File | Size | Format
s10579-024-09744-w.pdf | 1.38 MB | Adobe PDF
Open Access Information
Status: open access
File Version: Version of Record

Page views: 104 (as of Oct 6, 2025)
Downloads: 36 (as of Oct 6, 2025)
Scopus citations: 1 (as of Oct 24, 2025)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.