Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/106645
View/Download Full Text
DC Field | Value | Language
dc.contributor | Department of Chinese and Bilingual Studies | en_US
dc.creator | Yu, S | en_US
dc.creator | Gu, C | en_US
dc.creator | Huang, K | en_US
dc.creator | Li, P | en_US
dc.date.accessioned | 2024-05-27T02:13:23Z | -
dc.date.available | 2024-05-27T02:13:23Z | -
dc.identifier.uri | http://hdl.handle.net/10397/106645 | -
dc.language.iso | en | en_US
dc.publisher | American Association for the Advancement of Science (AAAS) | en_US
dc.rights | Copyright © 2024 the Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works. Distributed under a Creative Commons Attribution NonCommercial License 4.0 (CC BY-NC) (https://creativecommons.org/licenses/by-nc/4.0/). | en_US
dc.rights | The following publication Shaoyun Yu et al., Predicting the next sentence (not word) in large language models: What model-brain alignment tells us about discourse comprehension. Sci. Adv. 10, eadn7744 (2024) is available at https://doi.org/10.1126/sciadv.adn7744. | en_US
dc.title | Predicting the next sentence (not word) in large language models: What model-brain alignment tells us about discourse comprehension | en_US
dc.type | Journal/Magazine Article | en_US
dc.identifier.volume | 10 | en_US
dc.identifier.issue | 21 | en_US
dc.identifier.doi | 10.1126/sciadv.adn7744 | en_US
dcterms.abstract | Current large language models (LLMs) rely on word prediction as their backbone pretraining task. Although word prediction is an important mechanism underlying language processing, human language comprehension occurs at multiple levels, involving the integration of words and sentences to achieve a full understanding of discourse. This study models language comprehension by using the next sentence prediction (NSP) task to investigate mechanisms of discourse-level comprehension. We show that NSP pretraining enhanced a model’s alignment with brain data, especially in the right hemisphere and in the multiple demand network, highlighting the contributions of nonclassical language regions to high-level language understanding. Our results also suggest that NSP can enable the model to better capture human comprehension performance and to better encode contextual information. Our study demonstrates that the inclusion of diverse learning objectives in a model leads to more human-like representations, and that investigating the neurocognitive plausibility of pretraining tasks in LLMs can shed light on outstanding questions in language neuroscience. | en_US
dcterms.accessRights | open access | en_US
dcterms.bibliographicCitation | Science advances, 24 May 2024, v. 10, no. 21, eadn7744 | en_US
dcterms.isPartOf | Science advances | en_US
dcterms.issued | 2024-05-24 | -
dc.identifier.eissn | 2375-2548 | en_US
dc.identifier.artn | eadn7744 | en_US
dc.description.validate | 202405 bcch | en_US
dc.description.oa | Version of Record | en_US
dc.identifier.FolderNumber | a2715 | -
dc.identifier.SubFormID | 48115 | -
dc.description.fundingSource | RGC | en_US
dc.description.fundingSource | Others | en_US
dc.description.fundingText | NCS-FO-1533625 | en_US
dc.description.pubStatus | Published | en_US
dc.description.oaCategory | CC | en_US
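
The abstract above turns on a single methodological contrast: pretraining with next sentence prediction (NSP) rather than word prediction alone. As a minimal, illustrative sketch only (not the authors' code; the "bert-base-uncased" checkpoint and the example sentences are arbitrary stand-ins), a BERT-style NSP head can be queried with the Hugging Face transformers library to score whether one sentence plausibly follows another:

    # Illustrative sketch: scoring a sentence pair with BERT's NSP head.
    import torch
    from transformers import BertTokenizer, BertForNextSentencePrediction

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertForNextSentencePrediction.from_pretrained("bert-base-uncased")
    model.eval()

    context = "The storm knocked out power across the city."     # hypothetical example
    candidate = "Crews worked through the night to restore it."  # hypothetical example

    inputs = tokenizer(context, candidate, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape (1, 2)

    # In BERT's NSP head, index 0 = "candidate follows context" (IsNext)
    # and index 1 = "candidate does not follow" (NotNext).
    p_is_next = torch.softmax(logits, dim=-1)[0, 0].item()
    print(f"P(candidate follows context) = {p_is_next:.3f}")

Unlike next word prediction, which scores one token at a time, this objective asks the model to judge coherence across a sentence boundary in a single forward pass, which is the discourse-level behavior the study relates to brain data.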
Appears in Collections: Journal/Magazine Article
Files in This Item:
File | Description | Size | Format
sciadv.adn7744.pdf |  | 1.44 MB | Adobe PDF
Open Access Information
Status: open access
File Version: Version of Record

Page views: 11 (as of Jun 30, 2024)
Downloads: 9 (as of Jun 30, 2024)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.