Journal of Computer Applications
张焱鹏1,赵于前1,张帆2,丘腾海1,桂瑰1,余伶俐1
Abstract: Vehicle routing methods based on Deep Reinforcement Learning (DRL) have attracted wide attention for their fast computation and end-to-end processing. However, most existing methods are limited to problems with uniformly distributed nodes and a fixed number of nodes, and their performance degrades when nodes are unevenly distributed or the number of nodes varies. To address this issue, a meta-learning framework based on improved MAML (Model-Agnostic Meta-Learning) and a graph variational autoencoder is proposed. Meta-training yields a well-initialized model that can be quickly fine-tuned on out-of-distribution tasks, which improves the model's generalization performance. To further enhance the effectiveness of meta-learning, the graph variational autoencoder is used to initialize the parameters of the meta-learning framework. Experimental results show that the proposed method handles vehicle routing problems with different node distributions and also performs well when the number of nodes varies. The average gap across five tasks is reduced by 0.45 compared with the method that does not use meta-learning. The meta-learning framework improves the effectiveness of reinforcement learning and, compared with advanced solvers, significantly shortens solving time while keeping costs close, which demonstrates the effectiveness of the proposed method.
Key words: Vehicle Routing Problem (VRP), Deep Reinforcement Learning (DRL), meta-learning, graph variational autoencoder, combinatorial optimization, policy gradient method
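To make the meta-training and fine-tuning idea in the abstract concrete, the following is a minimal sketch of generic first-order MAML on toy quadratic tasks. It is not the authors' improved MAML, and it involves no routing policy, reinforcement learning, or graph variational autoencoder. The task targets, learning rates, and function names (`loss_and_grad`, `adapt`, `meta_train`) are all assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical toy tasks: each task i has loss 0.5 * ||theta - target_i||^2.
TASK_TARGETS = np.array([[2.0, 0.0], [0.0, 2.0], [4.0, 2.0], [2.0, 4.0]])


def loss_and_grad(theta, target):
    """Return the quadratic task loss and its gradient with respect to theta."""
    diff = theta - target
    return 0.5 * float(diff @ diff), diff


def adapt(theta, target, inner_lr=0.1, steps=3):
    """Inner loop: fine-tune a copy of the shared initialization on one task."""
    phi = theta.copy()
    for _ in range(steps):
        _, grad = loss_and_grad(phi, target)
        phi = phi - inner_lr * grad
    return phi


def meta_train(task_targets, inner_lr=0.1, outer_lr=0.05, steps=3, iters=500):
    """Outer loop: first-order MAML using every training task at each iteration."""
    theta = np.zeros(task_targets.shape[1])
    for _ in range(iters):
        meta_grad = np.zeros_like(theta)
        for target in task_targets:
            phi = adapt(theta, target, inner_lr, steps)
            # First-order approximation: use the gradient at the adapted
            # parameters and ignore second-order terms of the inner loop.
            _, grad = loss_and_grad(phi, target)
            meta_grad += grad
        theta = theta - outer_lr * meta_grad / len(task_targets)
    return theta


if __name__ == "__main__":
    theta = meta_train(TASK_TARGETS)
    new_task = np.array([3.0, 3.0])  # unseen task, assumed for illustration
    meta_loss, _ = loss_and_grad(adapt(theta, new_task), new_task)
    zero_loss, _ = loss_and_grad(adapt(np.zeros(2), new_task), new_task)
    print("loss after fine-tuning from meta-learned init:", meta_loss)
    print("loss after fine-tuning from zero init:", zero_loss)
```

In this toy setting the meta-learned initialization settles at a point from which a few gradient steps reach any nearby task, so the same fine-tuning budget gives a lower loss on the unseen task than starting from zeros. That mirrors, in a much simpler form, the role the abstract assigns to the well-initialized model.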
CLC Number: TP181
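张焱鹏, 赵于前, 张帆, 丘腾海, 桂瑰, 余伶俐. Method for solving the capacitated vehicle routing problem based on improved MAML and graph variational autoencoder[J]. Journal of Computer Applications, DOI: 10.11772/j.issn.1001-9081.2024111589.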
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2024111589