Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/105484
PIRA download icon_1.1View/Download Full Text
Title: MOTS : minimax optimal Thompson sampling
Authors: Jin, T
Xu, P
Shi, J 
Xiao, X
Gu, Q
Issue Date: 2021
Source: Proceedings of Machine Learning Research, 2021, v. 139, p. 5074-5083
Abstract: Thompson sampling is one of the most widely used algorithms in many online decision problems due to its simplicity for implementation and superior empirical performance over other state-of-the-art methods. Despite its popularity and empirical success, it has remained an open problem whether Thompson sampling can achieve the minimax optimal regret O(\sqrt{TK}) for K-armed bandit problems, where T is the total time horizon. In this paper we fill this long open gap by proposing a new Thompson sampling algorithm called MOTS that adaptively truncates the sampling result of the chosen arm at each time step. We prove that this simple variant of Thompson sampling achieves the minimax optimal regret bound O(\sqrt{TK}) for finite time horizon T and also the asymptotic optimal regret bound when T grows to infinity as well. This is the first time that the minimax optimality of multi-armed bandit problems has been attained by Thompson sampling type of algorithms.
Publisher: PMLR web site
Journal: Proceedings of Machine Learning Research
ISSN: 2640-3498
Rights: Copyright 2021 by the author(s).
Posted with permission of the author.
The following publication Tianyuan Jin, Pan Xu, Jieming Shi, Xiaokui Xiao, Quanquan Gu Proceedings of the 38th International Conference on Machine Learning, PMLR 139:5074-5083, 2021 is available at https://proceedings.mlr.press/v139/jin21d.html.
Appears in Collections:Conference Paper

Files in This Item:
File Description SizeFormat 
jin21d.pdf3.06 MBAdobe PDFView/Open
Open Access Information
Status open access
File Version Version of Record
Access
View full-text via PolyU eLinks SFX Query
Show full item record

Page views

41
Citations as of May 11, 2025

Downloads

10
Citations as of May 11, 2025

Google ScholarTM

Check


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.