Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (12): 3947-3956. DOI: 10.11772/j.issn.1001-9081.2024111677

• Network and communications •

Secure and reliable service function chain deployment based on encoder-decoder structured reinforcement learning

Xiang KUANG1, Zhen MA2,3, Wanchun ZHU1, Zhi ZHANG1, Yunfei CUI1

  1. College of Information Engineering, Guiyang Institute of Information Science and Technology, Guiyang, Guizhou 550025, China
  2. College of Intelligent Engineering, Guiyang Institute of Information Science and Technology, Guiyang, Guizhou 550025, China
  3. College of Big Data and Information Engineering, Guizhou University, Guiyang, Guizhou 550025, China
  • Received: 2024-11-29 Revised: 2025-03-29 Accepted: 2025-03-31 Online: 2025-04-08 Published: 2025-12-10
  • Contact: Zhen MA
  • About author: KUANG Xiang, born in 1991, lecturer. His research interests include network optimization, deep learning, and system maintenance.
    MA Zhen, born in 1990, Ph. D. candidate, associate professor. His research interests include deep learning, artificial intelligence, and autonomous driving.
    ZHU Wanchun, born in 1978, Ph. D., professor. His research interests include big data.
    ZHANG Zhi, born in 1978, M. S., lecturer. His research interests include reinforcement learning and artificial intelligence.
    CUI Yunfei, born in 1990, lecturer. His research interests include cloud technology and computer networks.
  • Supported by:
    University-Industry Collaborative Education Program of the Ministry of Education (220700595294923)

Chinese title: 基于编解码结构强化学习的安全可靠服务功能链部署

Chinese author names: 况翔 (KUANG Xiang), 马震 (MA Zhen), 朱万春 (ZHU Wanchun), 张智 (ZHANG Zhi), 崔云飞 (CUI Yunfei)

  • Native places: KUANG Xiang, Hezhang, Guizhou; MA Zhen, Xinxiang, Henan; ZHU Wanchun, Qinglong, Guizhou; ZHANG Zhi, Zunyi, Guizhou; CUI Yunfei, Zhangjiakou, Hebei.

Abstract:

To allocate limited network resources in cloud computing efficiently, ensuring Quality of Service (QoS) while improving resource utilization and management efficiency, a security- and reliability-driven Encoder-Decoder-based Deep Reinforcement Learning (ED-DRL) method was proposed for Service Function Chain (SFC) deployment. In this method, SFC deployment was modeled as a Markov Decision Process (MDP); a Graph ATtention network (GAT) encoder and a Gated Recurrent Unit (GRU) decoder were employed to extract network topology features and inter-node dependencies, and the Asynchronous Advantage Actor-Critic (A3C) algorithm was used to generate SFC deployment strategies dynamically. To address security and reliability requirements, the reward function was redesigned to guide the policy network toward selecting optimal resources. Simulation results show that ED-DRL achieves an acceptance rate of 70.7% and an average revenue of 0.0635, outperforming comparison methods such as the Continuous-Decision scheme relying on Reinforcement Learning (CDRL).

Key words: Service Function Chain (SFC) deployment, reinforcement learning, Markov Decision Process (MDP), Asynchronous Advantage Actor-Critic (A3C), Graph ATtention network (GAT), Gated Recurrent Unit (GRU)
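
Illustrative sketch (not taken from the paper): the abstract names a GAT encoder for extracting topology features, so the following minimal example shows how a generic single-head graph attention layer aggregates neighbor features. It is a sketch under stated assumptions, not the authors' implementation: the function `gat_layer`, the toy three-node network, the two-dimensional node features, and the values of `W` and `a` are all hypothetical, and the output nonlinearity of a full GAT layer is omitted.

```python
import math

def leaky_relu(x, slope=0.2):
    """LeakyReLU, the activation commonly used for GAT attention scores."""
    return x if x > 0 else slope * x

def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def gat_layer(features, neighbors, W, a):
    """Generic single-head graph attention layer (hypothetical helper).

    features:  dict node -> feature vector
    neighbors: dict node -> list of neighbor nodes (include the node itself)
    W:         shared linear map, out_dim x in_dim
    a:         attention vector of length 2 * out_dim
    Returns (embeddings, attention weights per node).
    """
    # Shared linear transform of every node's features.
    z = {v: matvec(W, h) for v, h in features.items()}
    embeddings, attention = {}, {}
    for i, nbrs in neighbors.items():
        # Unnormalized score e_ij = LeakyReLU(a . [z_i || z_j]).
        scores = [leaky_relu(sum(c * x for c, x in zip(a, z[i] + z[j])))
                  for j in nbrs]
        # Numerically stable softmax over the neighborhood of i.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        attention[i] = dict(zip(nbrs, alphas))
        # New embedding: attention-weighted sum of transformed neighbors.
        dim = len(z[i])
        embeddings[i] = [sum(al * z[j][k] for al, j in zip(alphas, nbrs))
                         for k in range(dim)]
    return embeddings, attention

# Toy substrate network; the two features per node are assumed placeholders
# (for example, normalized CPU capacity and a security level).
features = {0: [1.0, 0.5], 1: [0.2, 0.9], 2: [0.7, 0.1]}
neighbors = {0: [0, 1, 2], 1: [1, 0], 2: [2, 0]}
W = [[1.0, 0.0], [0.0, 1.0]]   # identity map, for readability only
a = [0.5, -0.5, 0.25, 0.25]    # arbitrary attention vector

embeddings, attention = gat_layer(features, neighbors, W, a)
for node in sorted(embeddings):
    print(node, embeddings[node], attention[node])
```

In an encoder-decoder arrangement of the kind the abstract describes, embeddings like these would be passed to a recurrent decoder that selects one substrate node per virtual network function; that decoder and the A3C training loop are beyond the scope of this sketch.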

CLC Number: