Please use this identifier to cite or link to this item:
http://hdl.handle.net/10397/106645
DC Field | Value | Language |
---|---|---|
dc.contributor | Department of Chinese and Bilingual Studies | en_US |
dc.creator | Yu, S | en_US |
dc.creator | Gu, C | en_US |
dc.creator | Huang, K | en_US |
dc.creator | Li, P | en_US |
dc.date.accessioned | 2024-05-27T02:13:23Z | - |
dc.date.available | 2024-05-27T02:13:23Z | - |
dc.identifier.uri | http://hdl.handle.net/10397/106645 | - |
dc.language.iso | en | en_US |
dc.publisher | American Association for the Advancement of Science (AAAS) | en_US |
dc.rights | Copyright © 2024 the Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works. Distributed under a Creative Commons Attribution NonCommercial License 4.0 (CC BY-NC) (https://creativecommons.org/licenses/by-nc/4.0/). | en_US |
dc.rights | The following publication Shaoyun Yu et al., Predicting the next sentence (not word) in large language models: What model-brain alignment tells us about discourse comprehension.Sci. Adv.10,eadn7744 (2024) is available at https://doi.org/10.1126/sciadv.adn7744. | en_US |
dc.title | Predicting the next sentence (not word) in large language models: what model-brain alignment tells us about discourse comprehension | en_US |
dc.type | Journal/Magazine Article | en_US |
dc.identifier.volume | 10 | en_US |
dc.identifier.issue | 21 | en_US |
dc.identifier.doi | 10.1126/sciadv.adn7744 | en_US |
dcterms.abstract | Current large language models (LLMs) rely on word prediction as their backbone pretraining task. Although word prediction is an important mechanism underlying language processing, human language comprehension occurs at multiple levels, involving the integration of words and sentences to achieve a full understanding of discourse. This study models language comprehension by using the next sentence prediction (NSP) task to investigate mechanisms of discourse-level comprehension. We show that NSP pretraining enhanced a model’s alignment with brain data especially in the right hemisphere and in the multiple demand network, highlighting the contributions of nonclassical language regions to high-level language understanding. Our results also suggest that NSP can enable the model to better capture human comprehension performance and to better encode contextual information. Our study demonstrates that the inclusion of diverse learning objectives in a model leads to more human-like representations, and investigating the neurocognitive plausibility of pretraining tasks in LLMs can shed light on outstanding questions in language neuroscience. | en_US |
dcterms.accessRights | open access | en_US |
dcterms.bibliographicCitation | Science advances, 24 May 2024, v. 10, no. 21, eadn7744 | en_US |
dcterms.isPartOf | Science advances | en_US |
dcterms.issued | 2024-05-24 | - |
dc.identifier.eissn | 2375-2548 | en_US |
dc.identifier.artn | eadn7744 | en_US |
dc.description.validate | 202405 bcch | en_US |
dc.description.oa | Version of Record | en_US |
dc.identifier.FolderNumber | a2715 | - |
dc.identifier.SubFormID | 48115 | - |
dc.description.fundingSource | RGC | en_US |
dc.description.fundingSource | Others | en_US |
dc.description.fundingText | NCS-FO-1533625 | en_US |
dc.description.pubStatus | Published | en_US |
dc.description.oaCategory | CC | en_US |
Appears in Collections: | Journal/Magazine Article |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
sciadv.adn7744.pdf | | 1.44 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.