Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/118566
DC Field | Value | Language
dc.contributor | Department of Computing | -
dc.creator | Tan, H | -
dc.creator | Xu, C | -
dc.creator | Li, J | -
dc.creator | Zhang, Y | -
dc.creator | Fang, Z | -
dc.creator | Chen, Z | -
dc.creator | Lai, B | -
dc.date.accessioned | 2026-04-24T03:01:29Z | -
dc.date.available | 2026-04-24T03:01:29Z | -
dc.identifier.issn | 2162-237X | -
dc.identifier.uri | http://hdl.handle.net/10397/118566 | -
dc.language.iso | en | en_US
dc.publisher | Institute of Electrical and Electronics Engineers | en_US
dc.rights | © 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. | en_US
dc.rights | The following publication H. Tan et al., 'HICL: Hashtag-Driven In-Context Learning for Social Media Natural Language Understanding,' in IEEE Transactions on Neural Networks and Learning Systems, vol. 36, no. 4, pp. 7037-7050, April 2025 is available at https://doi.org/10.1109/TNNLS.2024.3384987. | en_US
dc.subject | In-context learning (ICL) | en_US
dc.subject | Natural language processing | en_US
dc.subject | Pretrained language model | en_US
dc.subject | Social media | en_US
dc.title | HICL: hashtag-driven in-context learning for social media natural language understanding | en_US
dc.type | Journal/Magazine Article | en_US
dc.identifier.spage | 7037 | -
dc.identifier.epage | 7050 | -
dc.identifier.volume | 36 | -
dc.identifier.issue | 4 | -
dc.identifier.doi | 10.1109/TNNLS.2024.3384987 | -
dcterms.abstract | Natural language understanding (NLU) is integral to various social media applications. However, the existing NLU models rely heavily on context for semantic learning, resulting in compromised performance when faced with short and noisy social media content. To address this issue, we leverage in-context learning (ICL), wherein language models learn to make inferences by conditioning on a handful of demonstrations to enrich the context, and propose a novel hashtag-driven ICL (HICL) framework. Concretely, we pretrain a model #Encoder, which employs #hashtags (user-annotated topic labels) to drive BERT-based pretraining through contrastive learning. Our objective here is to enable #Encoder to gain the ability to incorporate topic-related semantic information, which allows it to retrieve topic-related posts to enrich contexts and enhance social media NLU with noisy contexts. To further integrate the retrieved context with the source text, we employ a gradient-based method to identify trigger terms useful in fusing information from both sources. For empirical studies, we collected 45 M tweets to set up an in-context NLU benchmark, and the experimental results on seven downstream tasks show that HICL substantially advances the previous state-of-the-art results. Furthermore, we conducted an extensive analysis and found that the following hold: 1) combining source input with a top-retrieved post from #Encoder is more effective than using semantically similar posts and 2) trigger words can largely benefit in merging context from the source and retrieved posts. | -
dcterms.accessRights | open access | en_US
dcterms.bibliographicCitation | IEEE transactions on neural networks and learning systems, Apr. 2025, v. 36, no. 4, p. 7037-7050 | -
dcterms.isPartOf | IEEE transactions on neural networks and learning systems | -
dcterms.issued | 2025-04 | -
dc.identifier.scopus | 2-s2.0-105002373430 | -
dc.identifier.pmid | 38619957 | -
dc.identifier.eissn | 2162-2388 | -
dc.description.validate | 202604 bcjz | -
dc.description.oa | Accepted Manuscript | en_US
dc.identifier.SubFormID | G001535/2025-12 | en_US
dc.description.fundingSource | RGC | en_US
dc.description.fundingSource | Others | en_US
dc.description.fundingText | This work was supported in part by the Research Grants Council of the Hong Kong Special Administrative Region, China, under Project PolyU/25200821; in part by the Innovation and Technology Fund under Project PRP/047/22FX; in part by the National Natural Science Foundation of China Young Scientists Fund under Grant 62006203; in part by the National Natural Science Foundation of China under Grant 62372220; and in part by the China Computer Federation-Baidu Open Research Fund under Grant 2021PP15002000. | en_US
dc.description.pubStatus | Published | en_US
dc.description.oaCategory | Green (AAM) | en_US
dc.relation.rdata | https://github.com/albertan017/HICL | -
Appears in Collections: Journal/Magazine Article
Files in This Item:
File | Description | Size | Format
Tan_Hicl_Hashtag-driven_In-context.pdf | Pre-Published version | 2.72 MB | Adobe PDF
Open Access Information
Status: open access
File Version: Final Accepted Manuscript
Access: View full-text via PolyU eLinks SFX Query

SCOPUS™ Citations: 2 (as of May 8, 2026)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.