Service integration method based on adaptive multi‑objective reinforcement learning

doi:10.11772/j.issn.1001-9081.2021122041

Journal of Computer Applications, 2022, Vol. 42, Issue (11): 3500-3505. DOI: 10.11772/j.issn.1001-9081.2021122041

• ChinaService 2021 •

Xiao GUO, Chunshan LI, Yuyue ZHANG, Dianhui CHU

School of Computer Science and Technology, Harbin Institute of Technology (Weihai), Weihai, Shandong 264209, China

Received: 2021-12-06 Revised: 2021-12-29 Accepted: 2022-01-13 Online: 2022-03-02 Published: 2022-11-10
Contact: Chunshan LI
About author: GUO Xiao, born in 1999, from Yichun, Heilongjiang, M. S. His research interests include service computing and knowledge engineering.
LI Chunshan, born in 1984, from Lüliang, Shanxi, Ph. D., associate professor, CCF member. His research interests include service computing and knowledge engineering. E-mail: lics@hit.edu.cn
ZHANG Yuyue, born in 2000, from Nanchang, Jiangxi. His research interests include knowledge engineering.
CHU Dianhui, born in 1970, from Weifang, Shandong, Ph. D., professor, CCF senior member. His research interests include service computing and intelligent manufacturing.
Supported by:
National Key Research and Development Program of China (2018YFB1402500); National Natural Science Foundation of China (61902090); Natural Science Foundation of Shandong Province (ZR2020KF019)

Abstract

Service resources in the Internet of Services (IoS) are becoming increasingly fine-grained and specialized, and single-function services cannot meet users' complex and changing requirements. Service integration and scheduling methods have therefore become a research focus in service computing. However, most existing methods consider only the satisfaction of user requirements and ignore the sustainability of the IoS ecosystem. To address this problem, a service integration method based on adaptive multi-objective reinforcement learning was proposed. In this method, a multi-objective optimization strategy was introduced into the framework of the Asynchronous Advantage Actor-Critic (A3C) algorithm to ensure the healthy development of the IoS ecosystem while satisfying user needs. The integration weights of the multi-objective values were adjusted dynamically according to the regret value, which mitigated the imbalance among sub-objective values in multi-objective reinforcement learning. Service integration was verified in a real large-scale service environment. Experimental results show that the proposed method solves faster than traditional machine learning methods in a large-scale service environment, and yields more balanced solution quality across objectives than Reinforcement Learning (RL) with fixed weights.
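The abstract states that the integration weights are adjusted according to a regret value, but this page does not give the update rule. The following is only a minimal sketch of one plausible scheme, not the authors' formula: regret is assumed to be the gap between the best value seen so far and the current value of each objective, and the function names `regrets`, `adapt_weights`, `scalarize` and the `step` parameter are hypothetical.

```python
from typing import List, Sequence


def regrets(best_values: Sequence[float], current_values: Sequence[float]) -> List[float]:
    """Per-objective regret (assumed definition): best value so far minus current value, floored at 0."""
    return [max(b - c, 0.0) for b, c in zip(best_values, current_values)]


def adapt_weights(weights: Sequence[float], regret: Sequence[float], step: float = 0.5) -> List[float]:
    """Shift weight toward objectives with larger regret, then renormalise so the weights sum to 1."""
    total_regret = sum(regret)
    if total_regret == 0.0:
        # No objective lags behind its best value, so the weights are left unchanged.
        return list(weights)
    raw = [w + step * r / total_regret for w, r in zip(weights, regret)]
    norm = sum(raw)
    return [x / norm for x in raw]


def scalarize(values: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of sub-objective values, usable as a single training signal."""
    return sum(w * v for w, v in zip(weights, values))


# Two objectives, e.g. user-requirement satisfaction and ecosystem sustainability.
weights = [0.5, 0.5]
best_so_far = [1.0, 1.0]
current = [0.9, 0.4]  # the second objective lags far behind its best value

new_weights = adapt_weights(weights, regrets(best_so_far, current))
total_value = scalarize(current, new_weights)
```

In this sketch the lagging objective receives a larger share of the weight on the next update, which is the balancing effect the abstract attributes to adaptive weights; with fixed weights, `adapt_weights` would simply be skipped.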

Key words: service integration, Reinforcement Learning (RL), Asynchronous Advantage Actor-Critic (A3C) algorithm, multi-objective optimization, adaptive weight

CLC Number: TP315

Xiao GUO, Chunshan LI, Yuyue ZHANG, Dianhui CHU. Service integration method based on adaptive multi‑objective reinforcement learning[J]. Journal of Computer Applications, 2022, 42(11): 3500-3505.

Figures/Tables 5

Fig. 1 AC model framework

Tab. 1 Parameter settings of ant colony algorithm and reinforcement learning algorithm

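Variable	Description	Value
size_pop	Number of ants	20
rho	Pheromone evaporation rate	0.1
UPDATE_GLOBAL_ITER	Update frequency of the reinforcement learning global network	5
GAMMA	$\varepsilon$-greedy parameter	0.9
THREAD_CNT	Number of reinforcement learning threads	4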
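The settings in Tab. 1 can be collected into typed configuration objects, as in the sketch below. The class names `AntColonyParams` and `RLParams` and the helper `evaporate` are hypothetical; only the field names and values come from the table. The evaporation step uses the standard ant colony rule, in which each pheromone value is multiplied by (1 - rho), and is an assumption about how `rho` is applied here.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class AntColonyParams:
    size_pop: int = 20  # number of ants
    rho: float = 0.1    # pheromone evaporation rate


@dataclass(frozen=True)
class RLParams:
    UPDATE_GLOBAL_ITER: int = 5  # update frequency of the global network
    GAMMA: float = 0.9           # described in Tab. 1 as the epsilon-greedy parameter
    THREAD_CNT: int = 4          # number of worker threads


def evaporate(pheromone: Sequence[float], rho: float) -> List[float]:
    """Standard ant colony evaporation: every trail keeps a (1 - rho) share of its pheromone."""
    return [(1.0 - rho) * tau for tau in pheromone]


aco = AntColonyParams()
rl = RLParams()
trails_after_one_step = evaporate([1.0, 2.0], aco.rho)
```

Freezing the dataclasses keeps the experiment settings immutable, so the same values are guaranteed to be seen by every worker thread.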

Fig. 2 Total objective value versus number of iterations for the three algorithms

Fig. 3 Total objective value versus time for the three algorithms

Fig. 4 Sub-objective values versus number of iterations for the three algorithms
