Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/118566
Title: HICL : hashtag-driven in-context learning for social media natural language understanding
Authors: Tan, H 
Xu, C 
Li, J 
Zhang, Y
Fang, Z
Chen, Z
Lai, B
Issue Date: Apr-2025
Source: IEEE transactions on neural networks and learning systems, Apr. 2025, v. 36, no. 4, p. 7037-7050
Abstract: Natural language understanding (NLU) is integral to various social media applications. However, the existing NLU models rely heavily on context for semantic learning, resulting in compromised performance when faced with short and noisy social media content. To address this issue, we leverage in-context learning (ICL), wherein language models learn to make inferences by conditioning on a handful of demonstrations to enrich the context and propose a novel hashtag-driven ICL (HICL) framework. Concretely, we pretrain a model #Encoder, which employs #hashtags (user-annotated topic labels) to drive BERT-based pretraining through contrastive learning. Our objective here is to enable #Encoder to gain the ability to incorporate topic-related semantic information, which allows it to retrieve topic-related posts to enrich contexts and enhance social media NLU with noisy contexts. To further integrate the retrieved context with the source text, we employ a gradient-based method to identify trigger terms useful in fusing information from both sources. For empirical studies, we collected 45 M tweets to set up an in-context NLU benchmark, and the experimental results on seven downstream tasks show that HICL substantially advances the previous state-of-the-art results. Furthermore, we conducted an extensive analysis and found that the following hold: 1) combining source input with a top-retrieved post from #Encoder is more effective than using semantically similar posts and 2) trigger words can largely benefit in merging context from the source and retrieved posts.
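The retrieve-then-combine step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words `embed` function is only a toy stand-in for #Encoder, and the names `retrieve_top1` and `build_input`, the `[SEP]` separator, and the placement of trigger terms between the source and the retrieved post are all assumptions made for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an encoder: bag-of-words counts (NOT #Encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_top1(source, corpus):
    """Return the corpus post most similar to the source post."""
    src = embed(source)
    return max(corpus, key=lambda post: cosine(src, embed(post)))


def build_input(source, retrieved, trigger_terms=(), sep=" [SEP] "):
    """Join source, optional trigger terms, and retrieved post.

    The ordering and separator are hypothetical, chosen for illustration.
    """
    trigger = " ".join(trigger_terms)
    parts = [source] + ([trigger] if trigger else []) + [retrieved]
    return sep.join(parts)


corpus = [
    "the match tonight was amazing football",
    "new phone release camera review",
    "stock market drops again",
]
source = "what a football match"
enriched = build_input(source, retrieve_top1(source, corpus))
```

In this sketch, `enriched` is the string that would be passed to a downstream NLU model in place of the short source post alone. A trained encoder would replace `embed`, and the trigger terms would come from the gradient-based search the abstract mentions rather than being supplied by hand.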
Keywords: In-context learning (ICL)
Natural language processing
Pretrained language model
Social media
Publisher: Institute of Electrical and Electronics Engineers
Journal: IEEE transactions on neural networks and learning systems 
ISSN: 2162-237X
EISSN: 2162-2388
DOI: 10.1109/TNNLS.2024.3384987
Research Data: https://github.com/albertan017/HICL
Rights: © 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
The following publication H. Tan et al., 'HICL: Hashtag-Driven In-Context Learning for Social Media Natural Language Understanding,' in IEEE Transactions on Neural Networks and Learning Systems, vol. 36, no. 4, pp. 7037-7050, April 2025 is available at https://doi.org/10.1109/TNNLS.2024.3384987.
Appears in Collections: Journal/Magazine Article

Files in This Item:
File: Tan_Hicl_Hashtag-driven_In-context.pdf
Description: Pre-Published version
Size: 2.72 MB
Format: Adobe PDF
Open Access Information
Status: open access
File Version: Final Accepted Manuscript

Scopus citations: 2 (as of May 8, 2026)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.