Heterogeneous multi-agent reinforcement learning enabled co-optimization of UAV 3D obstacle avoidance and edge computing

doi:10.11772/j.issn.1001-9081.2025080956

Journal of Computer Applications (《计算机应用》), official website


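陈冠良 (CHEN Guanliang)¹, 刘义 (LIU Yi)¹, 余意 (YU Yi)²

1. Guangdong University of Technology (广东工业大学)
2. Changsha Electronic Industry School (长沙市电子工业学校)

Received: 2025-08-20 Revised: 2025-11-30 Online: 2026-02-12 Published: 2026-02-12
Corresponding author: 刘义 (LIU Yi)
Funding: Key Technologies for 6G All-Scenario On-Demand Services (6G全场景按需服务关键技术)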


Abstract

The rapid growth of Internet of Things (IoT) and mobile terminal devices generates massive computational tasks that strain network latency and energy budgets. To address this, a multi-unmanned aerial vehicle (UAV)-assisted mobile edge computing (MEC) system is studied, in which UAVs provide computation offloading services for ground users. Multiple UAVs cooperate to process computational tasks in a three-dimensional space containing complex obstacles. To minimize the weighted sum of the maximum task completion latency over all users and the total system energy consumption, the users' discrete offloading decisions and the UAVs' continuous three-dimensional trajectories are jointly optimized. For this hybrid (discrete-continuous) optimization problem, an algorithm based on heterogeneous multi-agent deep reinforcement learning, UOUM (User Offloading and UAV Mobility Co-optimization), is proposed. First, the algorithm builds a heterogeneous multi-agent framework in which user offloading decisions use a discrete action space and UAV trajectory optimization uses a continuous action space, which handles the hybrid action space. Second, a differential reward mechanism quantifies the marginal contribution of each agent's policy and allocates rewards accordingly, which addresses the multi-agent credit assignment problem. Third, artificial potential field constraints convert the obstacle avoidance requirement into a differentiable safety potential function, which keeps UAVs clear of obstacles and improves training efficiency. Simulation results show that, across the tested scenarios, UOUM outperforms three baseline algorithms (an offloading-only optimization algorithm, a trajectory-only optimization algorithm, and a heterogeneous multi-agent reinforcement learning algorithm) in latency, energy consumption, and system cost, which supports its effectiveness and reliability. The improvements in latency, energy consumption, and obstacle avoidance safety indicate that UOUM adapts well to different environments.

Key words: mobile edge computing, multi-agent reinforcement learning, UAV obstacle avoidance, task offloading, three-dimensional trajectory optimization
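
The abstract names three quantities without giving formulas: the weighted latency-energy cost, the differential reward, and the safety potential. The following minimal Python sketch illustrates one common form of each. It is not the authors' implementation: the function names, the weight `w`, the counterfactual `default_action`, and the classic repulsive potential with `influence_radius` and `gain` are all assumptions made here for illustration.

```python
import math


def system_cost(latencies, energies, w=0.5):
    """Weighted sum of the maximum user latency and the total energy.

    `w` is an assumed trade-off weight in [0, 1]; the paper's value is not
    given in the abstract.
    """
    return w * max(latencies) + (1.0 - w) * sum(energies)


def difference_reward(global_reward_fn, joint_action, agent, default_action):
    """Marginal contribution of one agent (a standard difference reward).

    Computes G(a) - G(a with `agent`'s action replaced by `default_action`).
    The choice of default action is an assumption of this sketch.
    """
    counterfactual = dict(joint_action)
    counterfactual[agent] = default_action
    return global_reward_fn(joint_action) - global_reward_fn(counterfactual)


def repulsive_potential(uav_pos, obstacle_pos, influence_radius=10.0, gain=1.0):
    """Classic artificial-potential-field repulsive term (assumed form).

    Zero outside `influence_radius`; grows as the UAV approaches the obstacle.
    """
    d = math.dist(uav_pos, obstacle_pos)
    if d >= influence_radius:
        return 0.0
    d = max(d, 1e-6)  # guard against division by zero at contact
    return 0.5 * gain * (1.0 / d - 1.0 / influence_radius) ** 2


if __name__ == "__main__":
    # Three users: the cost uses the worst latency and the summed energy.
    print(system_cost([1.0, 3.0, 2.0], [0.5, 0.5, 1.0], w=0.5))

    # Toy global reward: negative total load over all agents.
    reward = lambda actions: -sum(actions.values())
    print(difference_reward(reward, {"u1": 2.0, "u2": 3.0}, "u1", 0.0))

    # A UAV 5 m from an obstacle, inside the 10 m influence radius.
    print(repulsive_potential((0.0, 0.0, 0.0), (5.0, 0.0, 0.0)))
```

In a training loop, a penalty proportional to `repulsive_potential` could be subtracted from each UAV agent's reward, and `difference_reward` could replace the shared global reward for each agent; whether UOUM combines them this way is not stated in the abstract.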


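Cite this article: 陈冠良, 刘义, 余意. Heterogeneous multi-agent reinforcement learning enabled co-optimization of UAV 3D obstacle avoidance and edge computing[J]. Journal of Computer Applications, DOI: 10.11772/j.issn.1001-9081.2025080956.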

