Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/116622
DC Field | Value | Language
dc.contributor | Department of Computing | en_US
dc.creator | Hong, M | en_US
dc.creator | Zhang, CJ | en_US
dc.creator | Chen, C | en_US
dc.creator | Lian, R | en_US
dc.creator | Jiang, D | en_US
dc.date.accessioned | 2026-01-07T03:22:06Z | -
dc.date.available | 2026-01-07T03:22:06Z | -
dc.identifier.isbn | 979-8-89176-194-0 | en_US
dc.identifier.uri | http://hdl.handle.net/10397/116622 | -
dc.description | 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics, Albuquerque, New Mexico, April 29-May 4, 2025 | en_US
dc.language.iso | en | en_US
dc.publisher | Association for Computational Linguistics | en_US
dc.rights | ©2025 Association for Computational Linguistics | en_US
dc.rights | Materials published in or after 2016 are licensed on a Creative Commons Attribution 4.0 International License. (https://creativecommons.org/licenses/by/4.0/) | en_US
dc.rights | The following publication Mengze Hong, Chen Jason Zhang, Chaotao Chen, Rongzhong Lian, and Di Jiang. 2025. Dialogue Language Model with Large-Scale Persona Data Engineering. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 3: Industry Track), pages 961–970, Albuquerque, New Mexico. Association for Computational Linguistics is available at https://doi.org/10.18653/v1/2025.naacl-industry.71. | en_US
dc.title | Dialogue language model with large-scale persona data engineering | en_US
dc.type | Conference Paper | en_US
dc.identifier.spage | 961 | en_US
dc.identifier.epage | 970 | en_US
dc.identifier.volume | 3 | en_US
dcterms.abstract | Maintaining persona consistency is paramount in the application of open-domain dialogue systems, as exemplified by models like ChatGPT. Despite significant advancements, the limited scale and diversity of current persona dialogue datasets remain challenges to achieving robust persona-consistent dialogue models. In this study, drawing inspiration from the success of large-scale pre-training, we introduce PPDS, an open-domain persona dialogue system that employs extensive generative pre-training on a persona dialogue dataset to enhance persona consistency. Specifically, we present a persona extraction model designed to autonomously and precisely generate vast persona dialogue datasets. Additionally, we unveil a pioneering persona augmentation technique to address the invalid persona bias inherent in the constructed dataset. Both quantitative and human evaluations consistently highlight the superior response quality and persona consistency of our proposed model, underscoring its effectiveness. | en_US
dcterms.accessRights | open access | en_US
dcterms.bibliographicCitation | In NAACL 2025 : Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics : Proceedings of the Conference : Industry Track, p. 961-970. Kerrville, USA: Association for Computational Linguistics (ACL), 2025 | en_US
dcterms.issued | 2025 | -
dc.relation.ispartofbook | NAACL 2025 : Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics : Proceedings of the Conference : Industry Track | en_US
dc.relation.conference | Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics [NAACL] | en_US
dc.publisher.place | Kerrville, USA | en_US
dc.description.validate | 202601 bcch | en_US
dc.description.oa | Version of Record | en_US
dc.identifier.FolderNumber | a4255b | -
dc.identifier.SubFormID | 52471 | -
dc.description.fundingSource | RGC | en_US
dc.description.pubStatus | Published | en_US
dc.description.oaCategory | CC | en_US
Appears in Collections: Journal/Magazine Article
Files in This Item:
File | Description | Size | Format
2025.naacl-industry.71.pdf | | 386 kB | Adobe PDF
Open Access Information
Status: open access
File Version: Version of Record
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.