Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (11): 3642-3648.DOI: 10.11772/j.issn.1001-9081.2024111589

• Advanced computing

Capacitated vehicle routing problem solving method based on improved MAML and GVAE

Yanpeng ZHANG1,2, Yuqian ZHAO1,2, Fan ZHANG1,2, Tenghai QIU1,3, Gui GUI1,2, Lingli YU1,2

  1. School of Automation, Central South University, Changsha Hunan 410083, China
  2. Key Laboratory of Industrial Intelligence and Systems, Ministry of Education (Central South University), Changsha Hunan 410083, China
  3. Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2024-11-11 Revised: 2025-02-14 Accepted: 2025-02-20 Online: 2025-02-26 Published: 2025-11-10
  • Contact: Yuqian ZHAO
  • About author: ZHANG Yanpeng, born in 1999, M. S. candidate. His research interests include deep reinforcement learning and combinatorial optimization.
    ZHANG Fan, born in 1989, Ph. D., associate professor. His research interests include image processing and laser manufacturing.
    QIU Tenghai, born in 1991, Ph. D. candidate. His research interests include collective intelligence decision-making and multi-agent reinforcement learning.
    GUI Gui, born in 1979, Ph. D., professor. Her research interests include artificial intelligence, big data models, and intelligent transportation systems.
    YU Lingli, born in 1983, Ph. D., professor. Her research interests include intelligent vehicle path planning and navigation control.
  • Supported by:
    National Key Research and Development Program of China (2022YFE0112300); Key Research and Development Program of Hunan Province (2023GK2021)

基于改进MAML与GVAE的容量约束车辆路径问题求解方法

张焱鹏1,2, 赵于前1,2, 张帆1,2, 丘腾海1,3, 桂瑰1,2, 余伶俐1,2

  1. 中南大学 自动化学院,长沙 410083
  2. 工业智能与系统教育部重点实验室(中南大学),长沙 410083
  3. 中国科学院 自动化研究所,北京 100190
  • 通讯作者: 赵于前
  • 作者简介:张焱鹏(1999—),男,黑龙江齐齐哈尔人,硕士研究生,主要研究方向:深度强化学习、组合优化
    张帆(1989—),男,副教授,博士,主要研究方向:图像处理、激光制造
    丘腾海(1991—),男,博士研究生,主要研究方向:群体智能决策、多智能体强化学习
    桂瑰(1979—),女,教授,博士,主要研究方向:人工智能、大数据模型、智能交通系统
    余伶俐(1983—),女,教授,博士,主要研究方向:智能车辆路径规划、导航控制。
  • 基金资助:
    国家重点研发计划项目(2022YFE0112300);湖南省重点研发计划项目(2023GK2021);湖南省重点研发计划项目(2024JK2028)

Abstract:

Deep Reinforcement Learning (DRL)-based vehicle routing methods have attracted significant attention for their fast solving speed and end-to-end processing. However, most existing methods are limited to scenarios with uniformly distributed nodes and a fixed number of nodes, and their performance degrades when nodes are unevenly distributed or the number of nodes varies. To address this issue, a meta-learning framework based on improved Model-Agnostic Meta-Learning (MAML) and a Graph Variational AutoEncoder (GVAE) is proposed. The framework obtains a well-initialized model through meta-training and then quickly fine-tunes it on out-of-distribution tasks, which improves the model's generalization performance. In addition, the GVAE is used to initialize the parameters of the meta-learning framework, further enhancing the effect of meta-learning. Experimental results show that the proposed method handles Vehicle Routing Problems (VRPs) with different node distributions well and also performs well when the number of nodes varies: the average gap across five tasks is 0.45 percentage points lower than that of the method without meta-learning. The proposed meta-learning framework thus enhances the effect of reinforcement learning, achieving solution quality comparable to that of state-of-the-art solvers while significantly shortening computation time.
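The meta-train-then-fine-tune idea in the abstract can be illustrated with a minimal sketch. This is not the paper's improved MAML, and it involves no routing policy or GVAE: it is plain first-order MAML on a toy family of one-dimensional tasks, where each hypothetical task has loss (theta - center)^2. All function names, learning rates, and task values below are assumptions chosen only for illustration.

```python
def task_grad(theta, center):
    """Gradient of the toy task loss (theta - center)**2 with respect to theta."""
    return 2.0 * (theta - center)


def fine_tune(theta, center, lr=0.1, steps=1):
    """Inner loop: plain gradient descent on a single task, starting from theta."""
    for _ in range(steps):
        theta = theta - lr * task_grad(theta, center)
    return theta


def meta_train(centers, theta=0.0, inner_lr=0.1, outer_lr=0.05, iterations=500):
    """Outer loop of first-order MAML over a fixed set of training tasks.

    For each task, adapt theta with one inner step, then use the task gradient
    evaluated at the adapted parameter as that task's meta-gradient
    (the first-order approximation; second-order terms are ignored).
    """
    for _ in range(iterations):
        meta_grad = 0.0
        for center in centers:
            adapted = fine_tune(theta, center, lr=inner_lr, steps=1)
            meta_grad += task_grad(adapted, center)
        theta = theta - outer_lr * meta_grad / len(centers)
    return theta


# Hypothetical training tasks and one task outside the training set.
train_centers = [1.0, 2.0, 3.0, 4.0, 5.0]
unseen_center = 6.0

meta_init = meta_train(train_centers)

# Same fine-tuning budget from the meta-learned start and from an untrained start.
from_meta = fine_tune(meta_init, unseen_center, steps=5)
from_scratch = fine_tune(0.0, unseen_center, steps=5)

print("meta-learned initialization:", meta_init)
print("error after fine-tuning from meta-init:", abs(from_meta - unseen_center))
print("error after fine-tuning from scratch:", abs(from_scratch - unseen_center))
```

In this toy setting the meta-learned initialization settles near the centre of the training tasks, so the same small number of fine-tuning steps ends closer to the unseen task's optimum than when starting from an untrained parameter. In the setting the abstract describes, the scalar parameter would correspond to the weights of a DRL routing policy and each task to a node distribution or problem size.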

Key words: Vehicle Routing Problem (VRP), Deep Reinforcement Learning (DRL), meta-learning, Graph Variational AutoEncoder (GVAE), combinatorial optimization, policy gradient method

摘要:

基于深度强化学习(DRL)的车辆路径规划方法以其求解速度快、端到端等优势受到广泛关注,但现有方法大多局限于对节点分布均匀和数量固定问题的求解,当面临节点不平均分布以及节点数变化的情况时,求解效果有所下降。针对这一问题,提出一种基于改进模型无关的元学习(MAML)和图变分自编码器(GVAE)的元学习框架,旨在通过元训练得到一个良好的初始化模型,并针对数据集外分布的任务进行快速微调,从而提升模型的泛化性能;此外利用GVAE初始化元学习框架的参数,以进一步提升元学习效果。实验结果表明,所提方法可以较好地处理不同节点分布情况下的车辆路径问题(VRP),在面对不同节点数量问题时也有较好的表现,在5种任务上的平均偏差率较未使用元学习的方法降低了0.45个百分点。利用元学习框架可有效提升强化学习的效果,与先进求解器相比,所提框架在保证成本接近的前提下可有效缩短求解时间。

关键词: 车辆路径问题, 深度强化学习, 元学习, 图变分自编码器, 组合优化, 策略梯度方法

CLC Number: