Query schedule and load shedding model in data stream system

Abstract

When system resources are limited, one of the major tasks of a data stream system is to execute queries promptly while minimizing the loss of performance and precision. This paper addresses the problem from two aspects: optimizing operator scheduling and performing load shedding. Taking the features of different operators into account, a scheduling strategy based on operator priority is presented, which considers both operator-related factors and the running state of the system. To adjust operator priority dynamically, an artificial neural network learning algorithm is introduced, which modifies operator priority according to system performance. To handle the potential overload caused by the unpredictable arrival of data in a data stream management system, load shedding in data stream systems is also studied. For queries with join operators over two streams, a semantic-based load shedding technique is applied. A data stream load shedding model is designed and implemented that resolves four questions: when to shed and restore load (shedding and anti-shedding time), how much to shed, where to shed, and which predicate to use. Analysis of the experimental results shows that the proposed load shedding model effectively avoids the low processing efficiency of an overloaded system and keeps the volume of arriving data in line with the system's processing capability.

Key words: data stream, query, schedule, priority, load shedding

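Abstract (Chinese version, translated): Executing queries quickly under limited resources while minimizing the loss of query precision is one of the main tasks of data stream query processing. This problem is studied from two aspects: optimized operator scheduling and load shedding. The main factors affecting operator scheduling are analyzed and, by combining the different ways operators process different tuples with the running state of the system, a priority-based scheduling model is designed and implemented. An algorithm from artificial neural networks is used to train the weight coefficients that determine operator priority, which yields scheduling based on dynamic priority. When a large burst of stream tuples enters the system and cannot be processed, load shedding lets the system drop part of the data in time, keeping the system running normally and improving the availability of query processing. For query requests containing two-stream join operators, the timing, amount, location, and predicates of load shedding and anti-shedding are studied, and a semantic-based load shedding model is designed and implemented. The results of running the algorithm and the model show that the system sheds load promptly when overloaded and performs anti-shedding promptly when underloaded, reducing the loss of performance.

Keywords (Chinese version, translated): data stream, query, scheduling, priority, load shedding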

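WANG Dan (王丹), LI Maozeng (李茂增). Query schedule and load shedding model in data stream system [J]. Journal of Computer Applications (计算机应用), 2009, 29(10): 2766-2771.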

