Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (10): 3290-3296. DOI: 10.11772/j.issn.1001-9081.2022091464

• Frontier and Comprehensive Applications •

UAV path planning for persistent monitoring based on value function iteration

Chen LIU1,2, Yang CHEN1,2, Hao FU3

  1. Institute of Robotics and Intelligent Systems, Wuhan University of Science and Technology, Wuhan, Hubei 430081, China
  2. Engineering Research Center for Metallurgical Automation and Measurement Technology of Ministry of Education (Wuhan University of Science and Technology), Wuhan, Hubei 430081, China
  3. School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, Hubei 430081, China
  • Received: 2022-09-30 Revised: 2023-01-13 Accepted: 2023-01-15 Online: 2023-03-02 Published: 2023-10-10
  • Contact: Yang CHEN
  • About author: LIU Chen, born in 1998 in Honghu, Hubei, M.S. candidate. His research interests include robot navigation and path planning.
    FU Hao, born in 1988 in Taojiang, Hunan, Ph.D., lecturer. His research interests include multi-robot reinforcement learning.
  • Supported by:
    National Natural Science Foundation of China (62173262)

Abstract:

Using an Unmanned Aerial Vehicle (UAV) to persistently monitor a designated area can deter intrusion and sabotage and help detect anomalies promptly, but a fixed monitoring pattern is easily discovered by intruders. A randomized algorithm for planning the UAV flight path is therefore needed. To address this problem, a UAV persistent monitoring path planning algorithm based on Value Function Iteration (VFI) is proposed. First, the states of the monitoring target points are selected appropriately, and the remaining time of each monitoring node is analyzed. Second, the value function of the state corresponding to each monitoring target point is constructed by combining the reward/penalty payoff with a path security constraint; during the VFI process, the next node is selected randomly based on the ε principle and roulette selection. Finally, the UAV persistent monitoring path is solved with the goal that the growth of the value functions of all states tends to saturation. Simulation results show that the proposed algorithm achieves an information entropy of 0.9050 and a VFI running time of 0.3637 s. Compared with the traditional Ant Colony Optimization (ACO) algorithm, the information entropy is increased by 216% and the running time is reduced by 59%, improving both randomness and speed. These results verify that randomized UAV flight paths are of great significance for improving the efficiency of persistent monitoring.

Key words: path planning, persistent monitoring, value iteration, roulette selection, ε principle
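
The abstract names two randomization ingredients, the ε principle and roulette selection, and reports randomness as an information entropy. The sketch below illustrates how these pieces could fit together; it is not the authors' implementation. The function names, the rule "exploit the best node with probability 1 - ε, otherwise draw by roulette", the min-shifted roulette weights, and the normalized Shannon entropy over visit counts are all assumptions made for illustration, since the abstract does not give the exact definitions.

```python
import math
import random


def roulette_select(weights, rng):
    """Pick an index with probability proportional to its non-negative weight."""
    total = sum(weights)
    if total <= 0:
        # All weights are zero: fall back to a uniform choice.
        return rng.randrange(len(weights))
    threshold = rng.random() * total
    cumulative = 0.0
    for index, weight in enumerate(weights):
        cumulative += weight
        if threshold < cumulative:
            return index
    return len(weights) - 1  # guard against floating-point round-off


def select_next_node(values, epsilon, rng):
    """Assumed epsilon rule: greedy with probability 1 - epsilon, else roulette.

    `values` holds a hypothetical value estimate for each candidate node.
    """
    if rng.random() >= epsilon:
        return max(range(len(values)), key=values.__getitem__)
    # Shift so weights are non-negative; the lowest-valued node gets weight 0
    # and is only reachable when all values are equal (uniform fallback).
    lowest = min(values)
    return roulette_select([v - lowest for v in values], rng)


def normalized_entropy(visit_counts):
    """Shannon entropy of visit frequencies, scaled to [0, 1] (assumed metric)."""
    total = sum(visit_counts)
    if total == 0 or len(visit_counts) < 2:
        return 0.0
    probs = [c / total for c in visit_counts if c > 0]
    entropy = sum(-p * math.log(p) for p in probs)
    return entropy / math.log(len(visit_counts))


if __name__ == "__main__":
    rng = random.Random(0)
    values = [1.0, 5.0, 2.0, 4.0]  # hypothetical node values
    counts = [0] * len(values)
    for _ in range(1000):
        counts[select_next_node(values, epsilon=0.5, rng=rng)] += 1
    print(counts, normalized_entropy(counts))
```

Under these assumptions, a larger ε spreads visits over more nodes and raises the entropy, while ε = 0 always picks the highest-valued node and yields a fully predictable choice.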

CLC Number: