Leveraging ChatGPT to empower training-free dataset condensation for content-based recommendation

Wu, J; Liu, Q; Hu, H; Fan, W; Liu, S; Li, Q; Wu, XM; Tang, K

doi:10.1145/3701716.3715555

Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/115927

Title:	Leveraging ChatGPT to empower training-free dataset condensation for content-based recommendation
Authors:	Wu, J; Liu, Q; Hu, H; Fan, W; Liu, S; Li, Q; Wu, XM; Tang, K
Issue Date:	2025
Source:	In WWW Companion ’25: Companion Proceedings of the ACM on Web Conference 2025, p. 1402-1406. New York, NY: The Association for Computing Machinery, 2025
Abstract:	Modern Content-Based Recommendation (CBR) techniques utilize item content to deliver personalized services, effectively mitigating information overload. However, these methods often require resource-intensive training on large datasets. To address this issue, we explore dataset condensation for textual CBR in this paper. Dataset condensation aims to synthesize a compact yet informative dataset, enabling models to achieve performance comparable to those trained on full datasets. Applying existing approaches to CBR presents two key challenges: (1) the difficulty of synthesizing discrete texts and (2) the inability to preserve user-item preference information. To overcome these limitations, we propose TF-DCon, an efficient dataset condensation method for CBR. TF-DCon employs a prompt-evolution module to guide ChatGPT in condensing discrete texts and integrates a clustering-based module to condense user preferences effectively. Extensive experiments conducted on three real-world datasets demonstrate TF-DCon's effectiveness. Notably, we are able to approximate up to 97% of the original performance while reducing the dataset size by 95% (on the MIND dataset). We have released our code and data for other researchers to reproduce our results.
Keywords:	Dataset condensation; Large language model; Recommender system
Publisher:	The Association for Computing Machinery
ISBN:	979-8-4007-1331-6
DOI:	10.1145/3701716.3715555
Description:	WWW '25: The ACM Web Conference 2025, Sydney NSW Australia, 28 April 2025 - 2 May 2025
Rights:	This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/legalcode). WWW Companion ’25, April 28-May 2, 2025, Sydney, NSW, Australia. ©2025 Copyright held by the owner/author(s). The following publication Wu, J., Liu, Q., Hu, H., Fan, W., Liu, S., Li, Q., Wu, X.-M., & Tang, K. (2025). Leveraging ChatGPT to Empower Training-free Dataset Condensation for Content-based Recommendation. In Companion Proceedings of the ACM on Web Conference 2025, Sydney NSW, Australia (pp. 1402-1406) is available at https://doi.org/10.1145/3701716.3715555.
Appears in Collections:	Conference Paper

Files in This Item:

File	Description	Size	Format
3701716.3715555.pdf		1.7 MB	Adobe PDF

Open Access Information

Status	open access
File Version	Version of Record
