Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/117295
DC Field | Value | Language
dc.contributor | Department of Aeronautical and Aviation Engineering | en_US
dc.creator | Liu, M | en_US
dc.creator | Tao, W | en_US
dc.creator | Huang, H | en_US
dc.date.accessioned | 2026-02-10T03:43:45Z | -
dc.date.available | 2026-02-10T03:43:45Z | -
dc.identifier.issn | 0952-1976 | en_US
dc.identifier.uri | http://hdl.handle.net/10397/117295 | -
dc.language.iso | en | en_US
dc.publisher | Elsevier Ltd | en_US
dc.subject | Badminton | en_US
dc.subject | Offline policy evaluation | en_US
dc.subject | Offline reinforcement learning | en_US
dc.subject | Tactical decision-making | en_US
dc.title | Offline reinforcement learning for badminton tactical decision-making | en_US
dc.type | Journal/Magazine Article | en_US
dc.identifier.volume | 164 | en_US
dc.identifier.doi | 10.1016/j.engappai.2025.113395 | en_US
dcterms.abstract | Sports data mining is becoming increasingly vital in modern competitive sports, driven by the need for athletes to continuously enhance their performance. Traditional methods of analyzing sports data rely heavily on expert experience and manual effort, which can be inefficient and unreliable. With advancements in artificial intelligence (AI), sports data is now being processed autonomously, providing more quantitative insights and more comprehensive analysis. This paper focuses on the role of tactics in sports, particularly in badminton, and explores the potential of using AI to enhance badminton tactical decision-making. We investigate the application of offline reinforcement learning (Offline RL) to develop tactical policies from pre-collected datasets, addressing challenges including algorithm design and offline policy evaluation. Specifically, we propose a new variant of conservative Q-learning (CQL), tailored for the hybrid action space, to train tactical policies using the integrated offline dataset Shuttle. To evaluate these policies, we develop a preference-based reward model that aligns with tactical preferences, offering an alternative to traditional offline policy evaluation methods. Our computer-based experimental results and analysis demonstrate that the proposed method achieves higher average rewards than all baseline methods and the behavior policy used for data collection. This underscores the potential of the proposed method to enhance badminton tactical decision-making and offer athletes more effective tactical recommendations. Code and data are available at https://github.com/Wenminggong/Offline_RL_for_Badminton. | en_US
dcterms.abstract | Graphical abstract: [Figure not available: see fulltext.] | en_US
dcterms.accessRights | embargoed access | en_US
dcterms.bibliographicCitation | Engineering applications of artificial intelligence, 15 Jan. 2026, v. 164, pt. B, 113395 | en_US
dcterms.isPartOf | Engineering applications of artificial intelligence | en_US
dcterms.issued | 2026-01-15 | -
dc.identifier.scopus | 2-s2.0-105023710217 | -
dc.identifier.eissn | 1873-6769 | en_US
dc.identifier.artn | 113395 | en_US
dc.description.validate | 202602 bcch | en_US
dc.description.oa | Not applicable | en_US
dc.identifier.SubFormID | G000920/2026-01 | -
dc.description.fundingSource | Others | en_US
dc.description.fundingText | This work was supported by the Research Institute for Sports Science and Technology (RISports), The Hong Kong Polytechnic University. | en_US
dc.description.pubStatus | Published | en_US
dc.date.embargo | 2028-01-15 | en_US
dc.description.oaCategory | Green (AAM) | en_US
dc.relation.rdata | https://github.com/Wenminggong/Offline_RL_for_Badminton# | -
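The abstract above names conservative Q-learning (CQL) as the basis of the proposed method. As a rough illustration only, not the paper's hybrid-action variant or its Shuttle dataset, a minimal sketch of the standard discrete-action CQL objective combines a TD error with a conservative penalty that pushes down Q-values on actions unseen in the logged data (the names `q_values`, `dataset_action`, `td_target`, and `alpha` are hypothetical):

```python
import numpy as np

def cql_loss(q_values, dataset_action, td_target, alpha=1.0):
    """Illustrative conservative Q-learning objective for one transition.

    q_values:       Q(s, a) for every action a in a finite set (1-D array).
    dataset_action: index of the action actually taken in the logged data.
    td_target:      bootstrapped target, e.g. r + gamma * max_a' Q(s', a').
    alpha:          weight of the conservative penalty.
    """
    # Standard TD error on the logged transition.
    bellman_error = (q_values[dataset_action] - td_target) ** 2
    # Conservative penalty: log-sum-exp over all actions pushes down
    # out-of-distribution Q-values, while subtracting the logged action's
    # Q-value pushes up actions supported by the data.
    logsumexp = np.log(np.sum(np.exp(q_values)))
    conservative_penalty = logsumexp - q_values[dataset_action]
    return bellman_error + alpha * conservative_penalty
```

With `alpha=0` this reduces to a plain Q-learning TD loss; increasing `alpha` trades Bellman accuracy for conservatism, which is the mechanism offline RL methods like CQL use to avoid overestimating actions absent from the pre-collected dataset.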
Appears in Collections: Journal/Magazine Article

Open Access Information
Status: embargoed access
Embargo End Date: 2028-01-15
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.