Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (8): 2626-2633.DOI: 10.11772/j.issn.1001-9081.2023081120

• Frontier and comprehensive applications •

Multi-robot path following and formation based on deep reinforcement learning

Haodong HE1, Hao FU1,2, Qiang WANG1, Shuai ZHOU1, Wei LIU1

  1. School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan Hubei 430081, China
  2. Hubei Key Laboratory of Digital Textile Equipment, Wuhan Hubei 430200, China
  • Received: 2023-08-22 Revised: 2023-11-16 Accepted: 2023-11-24 Online: 2023-12-18 Published: 2024-08-10
  • Contact: Hao FU
  • About author: HE Haodong, born in 1997, M. S. candidate. His research interests include multi-robot intelligent control and reinforcement learning.
    WANG Qiang, born in 1995, M. S. candidate. His research interests include multi-robot intelligent control and artificial intelligence.
    ZHOU Shuai, born in 2000, M. S. candidate. His research interests include offline reinforcement learning and intelligent robots.
    LIU Wei, born in 1998, M. S. candidate. His research interests include multi-robot intelligent control.
  • Supported by:
    National Natural Science Foundation of China (62173262); Scientific Research Project of Education Department of Hubei Province (B2021020); Knowledge Innovation Special Project of Wuhan (2022010801020315); Hubei Key Laboratory of Digital Textile Equipment (KDTL2022002); Hubei Provincial Advantaged Characteristic Disciplines (Groups) Project of Wuhan University of Science and Technology (2023D031)

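Multi-robot path following and formation based on deep reinforcement learning

Haodong HE1, Hao FU1,2, Qiang WANG1, Shuai ZHOU1, Wei LIU1

  1. School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan 430081
  2. Hubei Key Laboratory of Digital Textile Equipment, Wuhan 430200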
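  • Corresponding author: Hao FU
  • About the authors: HE Haodong, born in 1997 in Bazhong, Sichuan, male, M. S. candidate. Main research interests: multi-robot intelligent control, reinforcement learning.
    FU Hao, born in 1988 in Yiyang, Hunan, male, lecturer, Ph. D. Main research interests: multi-robot systems, reinforcement learning. fuhao@wust.edu.cn
    WANG Qiang, born in 1995 in Chongqing, male, M. S. candidate. Main research interests: multi-robot intelligent control, artificial intelligence.
    ZHOU Shuai, born in 2000 in Tianmen, Hubei, male, M. S. candidate. Main research interests: offline reinforcement learning, intelligent robots.
    LIU Wei, born in 1998 in Huanggang, Hubei, male, M. S. candidate. Main research interests: multi-robot intelligent control.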
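  • Funding:
    National Natural Science Foundation of China (62173262); Scientific Research Project of Education Department of Hubei Province (B2021020); Knowledge Innovation Special Project of Wuhan (2022010801020315); Open Project of Hubei Key Laboratory of Digital Textile Equipment (KDTL2022002); Hubei Provincial Advantaged Characteristic Disciplines (Groups) Project of Wuhan University of Science and Technology (2023D031)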

Abstract:

To address the obstacle avoidance and trajectory smoothness problems of multi-robot path following and formation in crowd environments, a multi-robot path following and formation algorithm based on deep reinforcement learning was proposed. Firstly, a pedestrian danger priority mechanism was established and combined with reinforcement learning to design a danger awareness network, which enhances the safety of the multi-robot formation. Subsequently, a virtual robot was introduced as the reference target for the multiple robots, so that path following was transformed into tracking control of the virtual robot by the multiple robots, which improves the smoothness of the robot trajectories. Finally, the proposed algorithm was compared with existing algorithms through simulation experiments, with both quantitative and qualitative analysis. The experimental results show that, compared with existing point-to-point path following algorithms, the proposed algorithm has excellent obstacle avoidance performance in crowd environments and can ensure the smoothness of the multi-robot motion trajectories.

Key words: multi-robot, path following, formation obstacle avoidance, reinforcement learning, crowd environment
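
The two ideas named in the abstract, ranking pedestrians by danger and replacing point-to-point following with tracking of a virtual robot, can be illustrated with a small geometric sketch. The abstract gives no formulas, so everything below is an assumption made for illustration: the function names, the polyline path model, the body-frame formation offsets, and the danger score (closing speed over distance) are hypothetical, and the learned danger awareness network itself is not modelled.

```python
import math

def virtual_robot_pose(waypoints, s):
    """Pose (x, y, heading) of a hypothetical virtual robot after travelling
    arc length s along a polyline path given as a list of (x, y) waypoints."""
    for (x0, y0), (x1, y1) in zip(waypoints, waypoints[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        if seg > 0.0 and s <= seg:
            t = max(s, 0.0) / seg  # fraction of this segment already covered
            return (x0 + t * (x1 - x0), y0 + t * (y1 - y0),
                    math.atan2(y1 - y0, x1 - x0))
        s -= seg
    # Past the end of the path: stay at the last waypoint.
    (x0, y0), (x1, y1) = waypoints[-2], waypoints[-1]
    return (x1, y1, math.atan2(y1 - y0, x1 - x0))

def formation_targets(pose, offsets):
    """Rotate body-frame offsets (forward, left) into the world frame so each
    real robot gets a continuously moving target around the virtual robot."""
    x, y, th = pose
    c, s = math.cos(th), math.sin(th)
    return [(x + c * dx - s * dy, y + s * dx + c * dy) for dx, dy in offsets]

def rank_pedestrians(robot_xy, pedestrians):
    """Sort pedestrians (px, py, vx, vy) from most to least dangerous.
    Assumed heuristic score: (1 + closing speed) / distance."""
    def danger(p):
        px, py, vx, vy = p
        rx, ry = robot_xy[0] - px, robot_xy[1] - py  # pedestrian -> robot
        dist = math.hypot(rx, ry)
        if dist == 0.0:
            return float("inf")
        closing = max(0.0, (vx * rx + vy * ry) / dist)
        return (1.0 + closing) / dist
    return sorted(pedestrians, key=danger, reverse=True)

if __name__ == "__main__":
    path = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0)]
    pose = virtual_robot_pose(path, 5.0)
    print(pose)
    print(formation_targets(pose, [(0.0, 0.0), (-1.0, 1.0), (-1.0, -1.0)]))
```

Because the targets move continuously with the virtual robot rather than jumping between fixed waypoints, a tracking controller sees a smoothly varying reference, which is one plausible reading of why the abstract associates the virtual robot with smoother trajectories.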

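Abstract:

To address obstacle avoidance and motion trajectory smoothness in multi-robot path following and formation in crowd environments, a multi-robot path following and formation algorithm based on deep reinforcement learning is proposed. First, a pedestrian danger priority mechanism is established and combined with reinforcement learning to design a danger awareness network, improving the safety of the multi-robot formation. Then, a virtual robot is introduced as the following target of the multiple robots, converting path following into following control of the virtual robot by the multiple robots and improving the smoothness of the robot motion trajectories. Finally, the proposed algorithm is compared with existing algorithms through simulation experiments, with both quantitative and qualitative analysis. The experimental results show that, compared with existing point-to-point path following algorithms, the proposed algorithm has excellent obstacle avoidance performance in crowd environments and can ensure the smoothness of the multi-robot motion trajectories.

Key words: multi-robot, path following, formation obstacle avoidance, reinforcement learning, crowd environment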
