Please use this identifier to cite or link to this item:
http://hdl.handle.net/10397/106645
DC Field | Value | Language |
---|---|---|
dc.contributor | Department of Chinese and Bilingual Studies | en_US |
dc.creator | Yu, S | en_US |
dc.creator | Gu, C | en_US |
dc.creator | Huang, K | en_US |
dc.creator | Li, P | en_US |
dc.date.accessioned | 2024-05-27T02:13:23Z | - |
dc.date.available | 2024-05-27T02:13:23Z | - |
dc.identifier.uri | http://hdl.handle.net/10397/106645 | - |
dc.language.iso | en | en_US |
dc.publisher | American Association for the Advancement of Science (AAAS) | en_US |
dc.rights | Copyright © 2024 the Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works. Distributed under a Creative Commons Attribution NonCommercial License 4.0 (CC BY-NC) (https://creativecommons.org/licenses/by-nc/4.0/). | en_US |
dc.rights | The following publication Shaoyun Yu et al., Predicting the next sentence (not word) in large language models: What model-brain alignment tells us about discourse comprehension.Sci. Adv.10,eadn7744 (2024) is available at https://doi.org/10.1126/sciadv.adn7744. | en_US |
dc.title | Predicting the next sentence (not word) in large language models: what model-brain alignment tells us about discourse comprehension | en_US |
dc.type | Journal/Magazine Article | en_US |
dc.identifier.volume | 10 | en_US |
dc.identifier.issue | 21 | en_US |
dc.identifier.doi | 10.1126/sciadv.adn7744 | en_US |
dcterms.abstract | Current large language models (LLMs) rely on word prediction as their backbone pretraining task. Although word prediction is an important mechanism underlying language processing, human language comprehension occurs at multiple levels, involving the integration of words and sentences to achieve a full understanding of discourse. This study models language comprehension by using the next sentence prediction (NSP) task to investigate mechanisms of discourse-level comprehension. We show that NSP pretraining enhanced a model’s alignment with brain data especially in the right hemisphere and in the multiple demand network, highlighting the contributions of nonclassical language regions to high-level language understanding. Our results also suggest that NSP can enable the model to better capture human comprehension performance and to better encode contextual information. Our study demonstrates that the inclusion of diverse learning objectives in a model leads to more human-like representations, and investigating the neurocognitive plausibility of pretraining tasks in LLMs can shed light on outstanding questions in language neuroscience. | en_US |
dcterms.accessRights | open access | en_US |
dcterms.bibliographicCitation | Science advances, 24 May 2024, v. 10, no. 21, eadn7744 | en_US |
dcterms.isPartOf | Science advances | en_US |
dcterms.issued | 2024-05-24 | - |
dc.identifier.eissn | 2375-2548 | en_US |
dc.identifier.artn | eadn7744 | en_US |
dc.description.validate | 202405 bcch | en_US |
dc.description.oa | Version of Record | en_US |
dc.identifier.FolderNumber | a2715 | - |
dc.identifier.SubFormID | 48115 | - |
dc.description.fundingSource | RGC | en_US |
dc.description.fundingSource | Others | en_US |
dc.description.fundingText | NCS-FO-1533625 | en_US |
dc.description.pubStatus | Published | en_US |
dc.description.oaCategory | CC | en_US |
Appears in Collections: | Journal/Magazine Article |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
sciadv.adn7744.pdf | | 1.44 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.