Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (11): 3493-3499. DOI: 10.11772/j.issn.1001-9081.2021101735

• ChinaService 2021

Event‑driven dynamic collection method for microservice invocation link data

Peng LI1,2, Zhuofeng ZHAO1,2, Han LI1,2

  1. School of Information Science and Technology, North China University of Technology, Beijing 100144, China
  2. Beijing Key Laboratory on Integration and Analysis of Large-Scale Stream Data (North China University of Technology), Beijing 100144, China
  • Received: 2021-10-09 Revised: 2021-11-08 Accepted: 2021-11-17 Online: 2022-01-24 Published: 2022-11-10
  • Contact: Peng LI
  • About author: LI Peng, born in 1996, M. S. His research interests include microservices and big data.
    ZHAO Zhuofeng, born in 1977, Ph. D., research fellow. His research interests include cloud computing, massive perceptual data processing, service computing, and smart cities.
    LI Han, born in 1981, Ph. D., associate research fellow. Her research interests include big data analysis and data quality management.
  • Supported by:
    National Key Research and Development Program of China (2019YFB1405100); Beijing Natural Science Foundation (4202021)

Event-driven dynamic collection method for microservice invocation link data

Peng LI1,2, Zhuofeng ZHAO1,2, Han LI1,2

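  1. School of Information Science and Technology, North China University of Technology, Beijing 100144
  2. Beijing Key Laboratory on Integration and Analysis of Large-Scale Stream Data (North China University of Technology), Beijing 100144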
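  • Contact: Peng LI
  • About author: LI Peng, born in 1996, male, from Jinan, Shandong, M. S., CCF member. His research interests include microservices and big data. lipeng_2@126.com
    ZHAO Zhuofeng, born in 1977, male, from Jinan, Shandong, research fellow, Ph. D., CCF member. His research interests include cloud computing, massive perceptual data processing, service computing, and smart cities.
    LI Han, born in 1981, female, from Shenyang, Liaoning, associate research fellow, Ph. D., CCF member. Her research interests include big data analysis and data quality management.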
  • Supported by:
    National Key Research and Development Program of China (2019YFB1405100); Beijing Natural Science Foundation (4202021)

Abstract:

Microservice invocation link data are an important type of data generated during the daily operation of a microservice application system; they record, in the form of a link, the series of service invocations that correspond to one user request. Because the system is distributed, invocation link data are generated on different microservice deployment nodes, and the current methods for collecting these distributed data are full collection and sampling collection. Full collection can incur high data transmission and storage costs, while sampling collection may miss critical invocation data. Therefore, an event-driven, pipeline-sampling-based dynamic collection method for microservice invocation link data was proposed, and a microservice invocation link system supporting dynamic collection of link data was designed and implemented on top of the open-source software Zipkin. Firstly, pipeline sampling was performed on the link data of different nodes that matched predefined event features; that is, the data collection server collected the data of a given link from all nodes only when some node generated data matching a defined event. Meanwhile, to address the inconsistent data generation rates of different nodes, time-window-based multi-threaded streaming data processing and data synchronization were used to collect and transmit the data of different nodes. Finally, because the link data of the nodes arrive at the server in different orders, the full link data were synchronized and aggregated through a timing alignment method. Experimental results on a public microservice invocation link dataset show that, compared with the full collection and sampling collection methods, the proposed method collects link data containing specific events such as anomalies and slow responses with higher accuracy and efficiency.
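
The event-driven collection idea can be illustrated with a small in-process sketch. It is not the authors' Zipkin-based implementation: the names `Span`, `NodeBuffer`, `Collector`, and `is_event`, the 500 ms slow-response threshold, and the 10 s cache window are all assumptions chosen for illustration. Each node caches its spans for a bounded window, and the collector pulls every cached span of a link only once some node reports a span that matches the event definition.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    trace_id: str      # identifies one invocation link (one user request)
    node: str          # node that produced the span
    timestamp_ms: int  # span start time
    duration_ms: int
    error: bool = False


def is_event(span, slow_ms=500):
    """Hypothetical event definition: an error or a slow response."""
    return span.error or span.duration_ms >= slow_ms


class NodeBuffer:
    """Per-node cache that keeps spans for a bounded time window."""

    def __init__(self, node, window_ms=10_000):
        self.node = node
        self.window_ms = window_ms
        self.by_trace = defaultdict(list)

    def add(self, span):
        self.by_trace[span.trace_id].append(span)

    def take(self, trace_id):
        """Hand over (and forget) every cached span of one link."""
        return self.by_trace.pop(trace_id, [])

    def expire(self, now_ms):
        """Drop links whose newest span is older than the window."""
        cutoff = now_ms - self.window_ms
        for tid in list(self.by_trace):
            if max(s.timestamp_ms for s in self.by_trace[tid]) < cutoff:
                del self.by_trace[tid]


class Collector:
    """Server side: collects a whole link only after an event is seen."""

    def __init__(self, buffers):
        self.buffers = buffers            # node name -> NodeBuffer
        self.collected = defaultdict(list)

    def on_span(self, span):
        self.buffers[span.node].add(span)
        # Trigger on a matching span, or on late spans of a link
        # that has already been selected for collection.
        if is_event(span) or span.trace_id in self.collected:
            for buf in self.buffers.values():
                self.collected[span.trace_id].extend(buf.take(span.trace_id))
```

In this sketch, links that never produce a matching span are never transmitted and simply age out of the node caches through `expire`, which is where the saving over full collection would come from.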

Key words: microservice, invocation link data, dynamic sampling, event matching, caching mechanism, service link tracing

Abstract (Chinese version):

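Microservice invocation link data are an important type of data generated during the daily operation of a microservice application system; in the form of a link, they record the series of service invocations that correspond to one user request in the application. Because the system is distributed, invocation link data are generated on different microservice deployment nodes, and these distributed data are currently collected by either full collection or sampling collection. Full collection incurs high data transmission and storage costs, while sampling collection may miss critical link data. Therefore, a dynamic collection method for microservice invocation link data based on event driving and pipeline sampling is proposed, and a dynamic collection system for microservice invocation link data is designed and implemented on top of the open-source software Zipkin. The system first performs pipeline sampling on the link data of different nodes that match predefined event features; that is, the data collection server collects the data of a given link from all nodes only when some node generates data matching a defined event. Meanwhile, to address the inconsistent data generation rates of different nodes, time-window-based multi-threaded streaming data processing and data synchronization are used to collect and transmit the data of different nodes. Finally, because the link data of the nodes arrive at the server in different orders, the full link data are synchronized and aggregated through timing alignment. Experimental results on a public microservice invocation link dataset show that, compared with the full collection and sampling collection methods, the proposed method achieves high collection accuracy and good efficiency for link data containing specific events such as anomalies and slow responses.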
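
The abstract does not spell out how timing alignment works, so the following is only one plausible reading, sketched under stated assumptions: each node delivers its spans in local time order, node clocks are comparable, and a span is reduced to a `(timestamp_ms, trace_id, name)` tuple. The function name `align` is hypothetical. A k-way merge by timestamp then restores a global order before spans are grouped into links.

```python
import heapq
from collections import defaultdict


def align(per_node_spans):
    """Merge per-node span lists into time-ordered links.

    per_node_spans maps a node name to a list of
    (timestamp_ms, trace_id, name) tuples, each list already
    sorted by timestamp. Returns trace_id -> [(timestamp_ms, name)]
    ordered by timestamp across all nodes.
    """
    # heapq.merge performs a lazy k-way merge of sorted inputs;
    # tuples compare by their first element, the timestamp.
    merged = heapq.merge(*per_node_spans.values())
    links = defaultdict(list)
    for ts, trace_id, name in merged:
        links[trace_id].append((ts, name))
    return dict(links)
```

A real deployment would also have to handle clock skew between nodes and spans that arrive after a link has been emitted; both are ignored here.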

Key words: microservice, invocation link data, dynamic sampling, event matching, caching mechanism, service link tracing

CLC Number: