Please use this identifier to cite or link to this item:
http://hdl.handle.net/10397/110881
| DC Field | Value | Language |
|---|---|---|
| dc.contributor | School of Professional Education and Executive Development | - |
| dc.contributor | Department of Computing | - |
| dc.creator | Xu, CH | - |
| dc.creator | Wang, J | - |
| dc.creator | Zhu, XH | - |
| dc.creator | Yue, Y | - |
| dc.creator | Zhou, WF | - |
| dc.creator | Liang, ZX | - |
| dc.creator | Wojtczak, D | - |
| dc.date.accessioned | 2025-02-14T07:17:28Z | - |
| dc.date.available | 2025-02-14T07:17:28Z | - |
| dc.identifier.issn | 2199-4536 | - |
| dc.identifier.uri | http://hdl.handle.net/10397/110881 | - |
| dc.language.iso | en | en_US |
| dc.publisher | SpringerOpen | en_US |
| dc.rights | © The Author(s) 2024 | en_US |
| dc.rights | Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. | en_US |
| dc.rights | The following publication Xu, C., Wang, J., Zhu, X. et al. Decentralized multi-agent cooperation via adaptive partner modeling. Complex Intell. Syst. 10, 4989–5004 (2024) is available at https://dx.doi.org/10.1007/s40747-024-01421-3. | en_US |
| dc.subject | Multi-agent reinforcement learning | en_US |
| dc.subject | Fictitious self play | en_US |
| dc.subject | Partner modeling | en_US |
| dc.subject | Partner sample complexity | en_US |
| dc.title | Decentralized multi-agent cooperation via adaptive partner modeling | en_US |
| dc.type | Journal/Magazine Article | en_US |
| dc.identifier.spage | 4989 | - |
| dc.identifier.epage | 5004 | - |
| dc.identifier.volume | 10 | - |
| dc.identifier.doi | 10.1007/s40747-024-01421-3 | - |
| dcterms.abstract | Multi-agent reinforcement learning encounters a non-stationary challenge, where agents concurrently update their policies, leading to changes in the environment. Existing approaches have tackled this challenge through communication among agents to obtain their partners' actions, but this introduces computational complexity known as partner sample complexity. An alternative approach is to develop partner models that generate samples instead of direct communication to mitigate this complexity. However, a discrepancy arises between the real policies distribution and the policy of partner models, termed as model bias, which can significantly impact performance when heavily relying on partner models. In order to achieve a trade-off between sample complexity and performance, a novel multi-agent model-based reinforcement learning algorithm called decentralized adaptive partner modeling (DAPM) is proposed, which utilizes fictitious self play (FSP) to construct partner models and update policies. Model bias is addressed by establishing an upper bound to restrict the usage of partner models. Coupled with that, an adaptive rollout approach is introduced, enabling real agents to dynamically communicate with partner models based on their quality, ensuring that agent performance can progressively improve with partner model samples. The effectiveness of DAPM is exhibited in two multi-agent tasks, showing that DAPM outperforms existing model-free algorithms in terms of partner sample complexity and training stability. Specifically, DAPM requires 28.5% fewer communications compared to the best baseline and exhibits reduced fluctuations in the learning curve, indicating superior performance. | - |
| dcterms.accessRights | open access | en_US |
| dcterms.bibliographicCitation | Complex & intelligent systems, 2024, v. 10, no. , p. 4989-5004 | - |
| dcterms.isPartOf | Complex & intelligent systems | - |
| dcterms.issued | 2024 | - |
| dc.identifier.isi | WOS:001220476300003 | - |
| dc.identifier.eissn | 2198-6053 | - |
| dc.description.validate | 202502 bcrc | - |
| dc.description.oa | Version of Record | en_US |
| dc.identifier.FolderNumber | OA_Scopus/WOS | en_US |
| dc.description.fundingSource | Others | en_US |
| dc.description.fundingText | Suzhou Science and Technology Project | en_US |
| dc.description.fundingText | Research Development Fund of XJTLU | en_US |
| dc.description.fundingText | Key Programme Special Fund of XJTLU and Suzhou Municipal Key Laboratory for Intelligent Virtual Engineering | en_US |
| dc.description.pubStatus | Published | en_US |
| dc.description.oaCategory | CC | en_US |
| Appears in Collections: | Journal/Magazine Article | |
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| s40747-024-01421-3.pdf | | 1.94 MB | Adobe PDF | View/Open |
Page views: 9 (as of Apr 14, 2025)
Downloads: 6 (as of Apr 14, 2025)
Web of Science citations: 1 (as of Dec 18, 2025)
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.



