Journal of Computer Applications, 2022, Vol. 42, Issue (11): 3500-3505. DOI: 10.11772/j.issn.1001-9081.2021122041

• ChinaService 2021

Service integration method based on adaptive multi‑objective reinforcement learning

Xiao GUO, Chunshan LI, Yuyue ZHANG, Dianhui CHU

  1. School of Computer Science and Technology, Harbin Institute of Technology (Weihai), Weihai, Shandong 264209, China
  • Received: 2021-12-06 Revised: 2021-12-29 Accepted: 2022-01-13 Online: 2022-03-02 Published: 2022-11-10
  • Contact: Chunshan LI
  • About author: GUO Xiao, born in 1999, M. S. His research interests include service computing and knowledge engineering.
    LI Chunshan, born in 1984, Ph. D., associate professor. His research interests include service computing and knowledge engineering.
    ZHANG Yuyue, born in 2000. His research interests include knowledge engineering.
    CHU Dianhui, born in 1970, Ph. D., professor. His research interests include service computing and intelligent manufacturing.
  • Supported by:
    National Key Research and Development Program of China (2018YFB1402500); National Natural Science Foundation of China (61902090); Natural Science Foundation of Shandong Province (ZR2020KF019)

基于自适应多目标强化学习的服务集成方法

郭潇, 李春山, 张宇跃, 初佃辉

  1. 哈尔滨工业大学(威海) 计算机科学与技术学院,山东 威海 264209
  • 通讯作者: 李春山
  • 作者简介:郭潇(1999—),男,黑龙江伊春人,硕士,主要研究方向:服务计算、知识工程
    李春山(1984—),男,山西吕梁人,副教授,博士,CCF会员,主要研究方向:服务计算、知识工程 lics@hit.edu.cn
    张宇跃(2000—),男,江西南昌人,主要研究方向:知识工程
    初佃辉(1970—),男,山东潍坊人,教授,博士,CCF高级会员,主要研究方向:服务计算、智慧制造。
  • 基金资助:
    国家重点研发计划项目(2018YFB1402500);国家自然科学基金资助项目(61902090);山东省自然科学基金资助项目(ZR2020KF019)

Abstract:

Service resources in the Internet of Services (IoS) are becoming increasingly fine-grained and specialized, and single-function services cannot meet users' complex and changing requirements, so service integration and scheduling methods have become a research hotspot in service computing. However, most existing service integration and scheduling methods consider only the satisfaction of user requirements and ignore the sustainability of the IoS ecosystem. To address this problem, a service integration method based on adaptive multi-objective reinforcement learning was proposed. In this method, a multi-objective optimization strategy was introduced into the framework of the Asynchronous Advantage Actor-Critic (A3C) algorithm, so that the healthy development of the IoS ecosystem was ensured while user requirements were satisfied. The integration weights of the multi-objective values were adjusted dynamically according to the regret value, which alleviated the imbalance among sub-objective values in multi-objective reinforcement learning. Service integration was verified in a real large-scale service environment. Experimental results show that the proposed method solves faster than traditional machine learning methods in a large-scale service environment, and yields more balanced solution quality across objectives than Reinforcement Learning (RL) with fixed weights.

Key words: service integration, Reinforcement Learning (RL), Asynchronous Advantage Actor-Critic (A3C) algorithm, multi-objective optimization, adaptive weight
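
The abstract states that the integration weights of the objective values are adjusted according to a regret value, but it does not give the update rule. The following is only a minimal sketch of that idea under stated assumptions: regret is taken to be each objective's shortfall from the best value observed so far, and a softmax over regrets is used to shift weight toward lagging objectives. The function names, the two example objectives, and the `temperature` parameter are all hypothetical and are not taken from the paper.

```python
import math


def regrets(best_values, current_values):
    """Per-objective regret, assumed here to be the shortfall from the best value seen."""
    return [max(b - c, 0.0) for b, c in zip(best_values, current_values)]


def adaptive_weights(regret_values, temperature=1.0):
    """Softmax over regrets (an illustrative choice): larger regret -> larger weight."""
    scaled = [r / temperature for r in regret_values]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def scalarize(values, weights):
    """Weighted sum of objective values, usable as a single scalar learning signal."""
    return sum(w * v for w, v in zip(weights, values))


# Hypothetical objectives: [user requirement satisfaction, ecosystem sustainability]
best = [1.0, 1.0]
current = [0.9, 0.4]

r = regrets(best, current)
w = adaptive_weights(r)
combined = scalarize(current, w)
print(r, w, combined)
```

In an actor-critic setting such as A3C, a scalar of this kind could stand in for the single reward or value estimate; with fixed weights, by contrast, an objective that falls behind receives no extra emphasis.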

摘要:

当前服务互联网(IoS)中的服务资源呈现精细化、专业化的趋势,功能单一的服务无法满足用户复杂多变的需求,服务集成调度方法已经成为服务计算领域的热点。现有的服务集成调度方法大都只考虑用户需求的满足,未考虑IoS生态系统的可持续性。针对上述问题,提出一种基于自适应多目标强化学习的服务集成方法,该方法在异步优势演员评论家(A3C)算法的框架下引入多目标优化策略,从而在满足用户需求的同时保证IoS生态系统的健康发展。所提方法可以根据遗憾值对多目标值集成权重进行动态调整,改善多目标强化学习中子目标值不平衡的现象。在真实大规模服务环境下进行了服务集成验证,实验结果表明所提方法相对于传统机器学习方法在大规模服务环境下求解速度更快;相较于权重固定的强化学习(RL),各目标的求解质量更均衡。

关键词: 服务集成, 强化学习, 异步优势演员评论家算法, 多目标优化, 自适应权重
