%0 Journal Article
%A Bosen ZENG
%A Xianhua NIU
%A Yong ZHONG
%T Q-table initialization approach for safe exploration based on factorization machine
%D 2022
%R 10.11772/j.issn.1001-9081.2021020239
%J Journal of Computer Applications
%P 209-214
%V 42
%N 1
%X

Most exploration/exploitation strategies in reinforcement learning ignore the risk introduced when the agent selects actions with random components during exploration. To address this problem, a Q-table initialization approach based on Factorization Machine (FM) was proposed for safe exploration. Firstly, the Q-values that had already been explored were introduced as prior knowledge. Then, FM was used to model the latent interactions between states and actions in the prior knowledge. Finally, the unknown Q-values in the Q-table were predicted with this model to further guide the agent's exploration. A/B testing was conducted in Cliffwalk, a grid-world reinforcement learning environment of OpenAI Gym. With the proposed approach, the number of bad exploration episodes was reduced by 68.12% for the Boltzmann exploration/exploitation strategy and by 89.98% for the Upper Confidence Bound (UCB) strategy. Experimental results show that the proposed approach improves the safety of exploration and accelerates convergence at the same time.
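
The three steps in the abstract (take explored Q-values as prior knowledge, fit an FM to them, predict the unknown entries) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each state and each action is one-hot encoded, in which case a second-order FM reduces to a global bias, a state bias, an action bias, and a dot product of latent vectors. It also assumes plain SGD on squared error. The function name `fm_fill_q_table` and the parameters `rank`, `lr`, `epochs`, and `seed` are hypothetical.

```python
import random


def fm_fill_q_table(q_table, rank=4, lr=0.05, epochs=200, seed=0):
    """Return a copy of q_table with None entries replaced by FM predictions.

    q_table is a list of rows (one per state); each row holds one value per
    action, with None marking a Q-value that has not been explored yet.
    """
    rng = random.Random(seed)
    n_states, n_actions = len(q_table), len(q_table[0])

    # FM parameters under the one-hot assumption: a global bias, one bias per
    # state and per action, and one latent vector per state and per action.
    w0 = 0.0
    w_s = [0.0] * n_states
    w_a = [0.0] * n_actions
    v_s = [[rng.gauss(0.0, 0.1) for _ in range(rank)] for _ in range(n_states)]
    v_a = [[rng.gauss(0.0, 0.1) for _ in range(rank)] for _ in range(n_actions)]

    def predict(s, a):
        interaction = sum(p * q for p, q in zip(v_s[s], v_a[a]))
        return w0 + w_s[s] + w_a[a] + interaction

    # Prior knowledge: the (state, action) pairs whose Q-value is known.
    known = [(s, a) for s in range(n_states) for a in range(n_actions)
             if q_table[s][a] is not None]

    # Fit the FM to the known entries with SGD on squared error.
    for _ in range(epochs):
        rng.shuffle(known)
        for s, a in known:
            err = predict(s, a) - q_table[s][a]
            w0 -= lr * err
            w_s[s] -= lr * err
            w_a[a] -= lr * err
            for k in range(rank):
                vs_k, va_k = v_s[s][k], v_a[a][k]
                v_s[s][k] -= lr * err * va_k
                v_a[a][k] -= lr * err * vs_k

    # Keep explored values as they are; predict only the unknown ones.
    return [[q_table[s][a] if q_table[s][a] is not None else predict(s, a)
             for a in range(n_actions)] for s in range(n_states)]
```

For example, `fm_fill_q_table([[-0.2, -0.4, None], [-0.4, -0.6, -0.8]])` would return a table of the same shape in which only the `None` entry has been replaced by a prediction. The filled table could then serve as the initial Q-table for a Boltzmann or UCB strategy.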

%U http://www.joca.cn/EN/10.11772/j.issn.1001-9081.2021020239