Please use this identifier to cite or link to this item:
http://hdl.handle.net/10397/105484
Title: | MOTS : minimax optimal Thompson sampling | Authors: | Jin, T Xu, P Shi, J Xiao, X Gu, Q |
Issue Date: | 2021 | Source: | Proceedings of Machine Learning Research, 2021, v. 139, p. 5074-5083 | Abstract: | Thompson sampling is one of the most widely used algorithms in many online decision problems due to its simplicity for implementation and superior empirical performance over other state-of-the-art methods. Despite its popularity and empirical success, it has remained an open problem whether Thompson sampling can achieve the minimax optimal regret O(\sqrt{TK}) for K-armed bandit problems, where T is the total time horizon. In this paper we fill this long open gap by proposing a new Thompson sampling algorithm called MOTS that adaptively truncates the sampling result of the chosen arm at each time step. We prove that this simple variant of Thompson sampling achieves the minimax optimal regret bound O(\sqrt{TK}) for finite time horizon T and also the asymptotic optimal regret bound when T grows to infinity as well. This is the first time that the minimax optimality of multi-armed bandit problems has been attained by Thompson sampling type of algorithms. | Publisher: | PMLR web site | Journal: | Proceedings of Machine Learning Research | ISSN: | 2640-3498 | Rights: | Copyright 2021 by the author(s). Posted with permission of the author. The following publication Tianyuan Jin, Pan Xu, Jieming Shi, Xiaokui Xiao, Quanquan Gu Proceedings of the 38th International Conference on Machine Learning, PMLR 139:5074-5083, 2021 is available at https://proceedings.mlr.press/v139/jin21d.html. |
Appears in Collections: | Conference Paper |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
jin21d.pdf | 3.06 MB | Adobe PDF | View/Open |
Page views
41
Citations as of May 11, 2025
Downloads
10
Citations as of May 11, 2025

Google ScholarTM
Check
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.