Journal of Computer Applications ›› 2015, Vol. 35 ›› Issue (12): 3491-3496.DOI: 10.11772/j.issn.1001-9081.2015.12.3491

• Artificial intelligence •

Multi-Agent path planning algorithm based on hierarchical reinforcement learning and artificial potential field

ZHENG Yanbin1,2, LI Bo1, AN Deyu1, LI Na1   

  1. College of Computer and Information Engineering, Henan Normal University, Xinxiang Henan 453007, China;
    2. Henan Engineering Laboratory of Intellectual Business and Internet of Things Technologies, Xinxiang Henan 453007, China
  • Received:2015-06-15 Revised:2015-07-10 Online:2015-12-10 Published:2015-12-10

  • Corresponding author: LI Bo (born 1989), male, from Kaifeng, Henan; M.S. candidate; research interests: virtual reality, multi-agent systems.
  • About the authors: ZHENG Yanbin (born 1964), male, from Neixiang, Henan; professor, Ph.D.; research interests: virtual reality, multi-agent systems, game theory. AN Deyu (born 1990), female, from Xinxiang, Henan; M.S. candidate; research interests: virtual reality, multi-agent systems. LI Na (born 1992), female, from Xinxiang, Henan; M.S. candidate; research interests: virtual reality.
  • Funding:
    Key Science and Technology Program of Henan Province (132102210537, 132102210538).

Abstract: To address the slow convergence and low efficiency of path planning algorithms, a multi-Agent path planning algorithm based on hierarchical reinforcement learning and artificial potential field was proposed. Firstly, the operating environment of the multi-Agent system was modeled as an artificial potential field, in which the potential energy of every point, representing the maximal reward obtainable under the optimal strategy, was determined from prior knowledge. Secondly, the model-free learning and local-update capabilities of hierarchical reinforcement learning were exploited to restrict strategy updates to a smaller local space or a lower-dimensional high-level space, improving the performance of the learning algorithm. Finally, the proposed algorithm was evaluated on the taxi problem in a grid environment. To better approximate the real environment and improve the portability of the algorithm, it was further verified in a three-dimensional simulation environment. The experimental results show that the algorithm converges quickly and that the convergence process is stable.
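As a hedged illustration of the abstract's core idea — seeding each state's value estimate from an artificial potential field (the estimated maximal reward reachable from that state) so that reinforcement learning converges faster — the sketch below runs tabular Q-learning on a small grid world. The grid size, reward values, and distance-based potential are illustrative assumptions, not the authors' exact formulation or their hierarchical decomposition.

```python
import random

# Hypothetical 5x5 grid: agent starts at (0, 0), goal at (4, 4).
SIZE = 5
GOAL = (4, 4)
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up
GAMMA, ALPHA, EPS = 0.9, 0.5, 0.1

def potential(state):
    """Attractive potential: higher (less negative) closer to the goal."""
    return -(abs(state[0] - GOAL[0]) + abs(state[1] - GOAL[1]))

def step(state, action):
    """Move within the grid; reaching the goal ends the episode."""
    nx = min(max(state[0] + action[0], 0), SIZE - 1)
    ny = min(max(state[1] + action[1], 0), SIZE - 1)
    nxt = (nx, ny)
    reward = 10.0 if nxt == GOAL else -1.0
    return nxt, reward, nxt == GOAL

# Initialize Q from the potential field instead of zeros, so the
# greedy policy is biased toward the goal from the first episode.
Q = {(x, y): {a: float(potential((x, y))) for a in ACTIONS}
     for x in range(SIZE) for y in range(SIZE)}

random.seed(0)
for _ in range(200):  # training episodes with epsilon-greedy exploration
    s, done = (0, 0), False
    while not done:
        a = (random.choice(ACTIONS) if random.random() < EPS
             else max(Q[s], key=Q[s].get))
        s2, r, done = step(s, a)
        target = r + (0.0 if done else GAMMA * max(Q[s2].values()))
        Q[s][a] += ALPHA * (target - Q[s][a])
        s = s2

# Greedy rollout from the start: the learned policy should reach the goal.
s, path = (0, 0), [(0, 0)]
for _ in range(20):
    s, _, done = step(s, max(Q[s], key=Q[s].get))
    path.append(s)
    if done:
        break
print(path[-1])
```

The potential-field initialization acts like reward shaping: the agent begins with a rough ordering of states by desirability, so far fewer exploratory episodes are wasted before the value estimates point toward the goal.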

Key words: path planning, Multi-Agent System (MAS), hierarchical reinforcement learning, artificial potential field, prior knowledge

