Journals
  Publication Years
  Keywords
Search within results Open Search
Please wait a minute...
For Selected: Toggle Thumbnails
Best action identification of tree structure based on ternary multi-arm bandit
LIU Guoqing, WANG Jieting, HU Zhiguo, QIAN Yuhua
Journal of Computer Applications    2019, 39 (8): 2252-2260.   DOI: 10.11772/j.issn.1001-9081.2018112394
Abstract469)      PDF (1397KB)(328)       Save
Monte Carlo Tree Search (MCTS) shows excellent performance in chess game problem. Most existing studies only consider the success and failure feedbacks and assum that the results follow the Bernoulli distribution. However, this setting ignores the usual result of draw, causing inaccurate assessment of the disk status and missing of optimal action. In order to solve this problem, Ternary Multi-Arm Bandit (TMAB) model was constructed and Best Arm identification of TMAB (TBBA) algorithm was proposed. Then, TBBA algorithm was applied to Ternary Minimax Sampling Tree (TMST). Finally, TBBA_tree algorithm based on the simple iteration of TBBA and Best Action identification of TMST (TTBA) algorithm based on transforming the tree structure into TMAB were proposed. In the experiments, two arm spaces with different precision were established, and several comparative TMABs and TMSTs were constructed based on the two arm spaces. Experimental results show that compared to the accuracy of uniform sampling algorithm, the accuracy of TBBA algorithm keeps rising steadily and can reach 100% partially, and the accuracy of TBBA algorithm is basically more than 80% with good generalization and stability and without outliers or fluctuation ranges.
Reference | Related Articles | Metrics