Capacitated vehicle routing problem solving method based on improved MAML and graph variational autoencoder

doi:10.11772/j.issn.1001-9081.2024111589

Journal of Computer Applications

Received: 2024-11-11 Revised: 2025-02-14 Accepted: 2025-02-20 Online: 2025-02-26 Published: 2025-02-26

张焱鹏¹,赵于前¹,张帆²,丘腾海¹,桂瑰¹,余伶俐¹

1. Central South University
2. School of Automation, Central South University

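Corresponding author: 赵于前
Funding:
Research on Medical Resource Supply and Allocation Models under Public Health Emergencies; Automatic Operation System for Rock Drilling Rigs Based on Big Data Technology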

Abstract

Abstract: Vehicle routing methods based on Deep Reinforcement Learning (DRL) have attracted widespread attention for their fast computation and end-to-end processing. However, most existing methods are limited to problems with uniformly distributed nodes and a fixed number of nodes, and their performance degrades when nodes are unevenly distributed or the number of nodes varies. To address this issue, a meta-learning framework based on improved MAML (Model-Agnostic Meta-Learning) and a graph variational autoencoder was proposed. Meta-training yields a well-initialized model that can be fine-tuned quickly on out-of-distribution tasks, which improves the model's generalization performance. To further enhance the effectiveness of meta-learning, the graph variational autoencoder was used to initialize the parameters of the meta-learning framework. Experimental results show that the proposed method handles vehicle routing problems with different node distributions and also performs well when the number of nodes varies. The average gap across five tasks is reduced by 0.45 compared with the method that does not use meta-learning. The meta-learning framework thus improves the effectiveness of reinforcement learning and, compared with advanced solvers, significantly shortens solving time while keeping costs close, which demonstrates the effectiveness of the proposed method.

Key words: Vehicle Routing Problem (VRP), Deep Reinforcement Learning (DRL), meta-learning, graph variational autoencoder, combinatorial optimization, policy gradient method
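
The meta-training and fine-tuning idea mentioned in the abstract can be illustrated with a minimal sketch of generic MAML. This is not the improved MAML variant or the routing policy of the paper: the tasks here are hypothetical scalar quadratic losses, and all function names and hyperparameters (`alpha`, `beta`, `iterations`) are assumptions chosen for illustration only.

```python
# Toy illustration of generic MAML (NOT the paper's improved variant).
# Each hypothetical "task" is a scalar loss L_c(theta) = (theta - c)^2.

def loss(theta, c):
    """Task loss for a task with optimum c."""
    return (theta - c) ** 2

def grad(theta, c):
    """Analytic gradient of the task loss with respect to theta."""
    return 2.0 * (theta - c)

def inner_adapt(theta, c, alpha, steps=1):
    """Fine-tune theta on one task with plain gradient descent."""
    for _ in range(steps):
        theta = theta - alpha * grad(theta, c)
    return theta

def maml_train(task_centers, alpha=0.1, beta=0.05, iterations=500):
    """Meta-train an initialization that adapts well after one inner step."""
    theta = 0.0
    for _ in range(iterations):
        meta_grad = 0.0
        for c in task_centers:
            adapted = inner_adapt(theta, c, alpha)
            # Chain rule through the single inner step:
            # d(adapted)/d(theta) = 1 - 2 * alpha for this quadratic loss.
            meta_grad += grad(adapted, c) * (1.0 - 2.0 * alpha)
        theta -= beta * meta_grad / len(task_centers)
    return theta

if __name__ == "__main__":
    centers = [1.0, 2.0, 3.0, 4.0, 5.0]      # meta-training tasks (assumed)
    theta0 = maml_train(centers)
    unseen = 6.0                              # task outside the training set
    tuned = inner_adapt(theta0, unseen, alpha=0.1, steps=5)
    print("meta-learned init:", theta0)
    print("loss before/after fine-tuning:", loss(theta0, unseen), loss(tuned, unseen))
```

In this toy setting the meta-learned initialization settles at the mean of the task optima, and a few gradient steps reduce the loss on a task that was not seen during meta-training. In the paper's setting, the scalar parameter would correspond to the parameters of the DRL routing policy.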

CLC Number: TP181

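张焱鹏, 赵于前, 张帆, 丘腾海, 桂瑰, 余伶俐. Capacitated vehicle routing problem solving method based on improved MAML and graph variational autoencoder [J]. Journal of Computer Applications, DOI: 10.11772/j.issn.1001-9081.2024111589.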
