Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (5): 1620-1624. DOI: 10.11772/j.issn.1001-9081.2022040630

Special Issue: Frontier and comprehensive applications

• Frontier and comprehensive applications •

Order dispatching by multi-agent reinforcement learning based on shared attention

Xiaohui HUANG, Kaiming YANG, Jiahao LING

  1. School of Information Engineering, East China Jiaotong University, Nanchang Jiangxi 330013, China
  • Received: 2022-05-06 Revised: 2022-07-11 Accepted: 2022-07-13 Online: 2022-08-05 Published: 2023-05-10
  • Contact: Kaiming YANG
  • About author: HUANG Xiaohui, born in 1984, Ph. D., associate professor. His research interests include deep learning and intelligent transportation.
    YANG Kaiming, born in 1996, M. S. candidate. His research interests include deep reinforcement learning and intelligent transportation.
    LING Jiahao, born in 1999, M. S. candidate. His research interests include deep reinforcement learning and intelligent transportation.
  • Supported by:
    National Natural Science Foundation of China (62062033); Natural Science Foundation of Jiangxi Province (20212BAB202008)


  • Author biography: HUANG Xiaohui (born 1984), male, from Shanggao, Jiangxi; associate professor, Ph. D., CCF member. Main research interests: deep learning, intelligent transportation.


Ride-hailing has become a popular way to travel because it is convenient and fast, and how to dispatch suitable orders efficiently so that passengers reach their destinations is an active research topic. Many studies train a single agent that assigns all orders centrally, so the vehicles themselves take no part in the decision making. To solve this problem, a multi-agent reinforcement learning algorithm based on shared attention, named SARL (Shared Attention Reinforcement Learning), was proposed. In the algorithm, the order dispatching problem was modeled as a Markov decision process, and multi-agent reinforcement learning with centralized training and decentralized execution was used to make each agent a decision-maker. Meanwhile, a shared attention mechanism was added so that the agents share information and cooperate with each other. Comparison experiments with Random matching (Random), Greedy algorithm (Greedy), Individual Deep-Q-Network (IDQN) and Q-learning MIXing network (QMIX) were conducted under different map scales, numbers of passengers and numbers of vehicles. Experimental results show that the SARL algorithm achieves the best time efficiency on maps of three different scales (100×100, 10×10 and 500×500) for both fixed and variable combinations of vehicles and passengers, which verifies its generalization ability and stability. The SARL algorithm can optimize the matching of vehicles and passengers, reduce passenger waiting time and improve passenger satisfaction.
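
The abstract does not specify the form of the shared attention mechanism. As a rough illustration only, the sketch below assumes scaled dot-product attention in which every agent uses the same projection matrices and attends to the embeddings of the other agents. The names `matvec`, `shared_attention`, `w_q`, `w_k` and `w_v` are hypothetical and are not taken from the paper.

```python
import math


def matvec(vec, mat):
    """Multiply a row vector (length d_in) by a d_in x d_out matrix of lists."""
    return [sum(v * row[j] for v, row in zip(vec, mat))
            for j in range(len(mat[0]))]


def shared_attention(embeds, w_q, w_k, w_v):
    """Assumed form of shared attention: one set of weights for all agents.

    embeds: list of per-agent observation embeddings (each of length d_in).
    w_q, w_k, w_v: d_in x d_att projection matrices shared by every agent.
    Returns (contexts, weights): one context vector per agent and the
    n x n attention weights, with zeros on the diagonal.
    """
    n = len(embeds)
    if n < 2:
        raise ValueError("attention over other agents needs at least 2 agents")
    qs = [matvec(e, w_q) for e in embeds]
    ks = [matvec(e, w_k) for e in embeds]
    vs = [matvec(e, w_v) for e in embeds]
    scale = math.sqrt(len(ks[0]))

    contexts, all_weights = [], []
    for i in range(n):
        # Each agent attends only to the other agents, not to itself.
        others = [j for j in range(n) if j != i]
        scores = [sum(a * b for a, b in zip(qs[i], ks[j])) / scale
                  for j in others]
        top = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [0.0] * n
        for j, e in zip(others, exps):
            weights[j] = e / total
        # Context = attention-weighted sum of the other agents' values.
        contexts.append([sum(weights[j] * vs[j][d] for j in range(n))
                         for d in range(len(vs[0]))])
        all_weights.append(weights)
    return contexts, all_weights
```

In a centralized-training setup, context vectors of this kind would typically be fed to a critic during training, while each agent still acts on its own observation at execution time. That wiring is also an assumption here, not a detail given in the abstract.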

Key words: machine learning, deep reinforcement learning, attention mechanism, multi-agent reinforcement learning, vehicle order dispatching




CLC Number: