Journal of Computer Applications, 2024, Vol. 44, Issue (10): 3167-3176. DOI: 10.11772/j.issn.1001-9081.2023101460
Hua HUANG, Ziyi YANG, Xiaolong LI, Chuang LI
Received: 2023-10-27
Revised: 2023-12-05
Accepted: 2023-12-15
Online: 2024-10-15
Published: 2024-10-10
Contact: Xiaolong LI
About author: HUANG Hua, born in 1981 in Hengyang, Hunan, Ph.D., associate professor, CCF member. His research interests include cloud computing, service computing, artificial intelligence, and blockchain.
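Abstract:
To address the declining model accuracy over time and the poor real-time performance of existing Business Process Monitoring (BPM) methods, a Predictive business Process Monitoring (PPM) method based on concept drift was proposed. First, the event log data were preprocessed and encoded. Second, a Bidirectional Long Short-Term Memory (Bi-LSTM) network was used to capture sufficient sequence information in both the forward and backward directions to build the business process model, and an attention mechanism was used to account for the contribution of each event to the prediction result by assigning different weights to the events in the log, thereby reducing the influence of noise on the prediction. Finally, running instances were fed into the constructed model to obtain predicted outcomes, and these results were then used as historical data to fine-tune the model. Test results on 8 public real-world datasets show that the average prediction accuracy of the proposed method is 5.4% to 23.8% higher than that of existing BPM methods such as Support Vector Machine (SVM), Logistic Regression (LR), and Random Forest (RF), and that the proposed method also outperforms existing methods in earliness and time performance.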
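The attention step described in the abstract can be illustrated with a minimal sketch. It assumes simple dot-product scoring against a query vector; the function name `attention_pool`, the `query` parameter, and the toy hidden states are hypothetical, and the paper's actual scoring function and Bi-LSTM outputs may differ.

```python
import math

def attention_pool(hidden_states, query):
    """Weight each event's hidden vector by softmax(h . q) and sum them.

    hidden_states: list of per-event vectors (stand-ins for Bi-LSTM outputs).
    query: hypothetical vector that scores how relevant each event is.
    """
    if not hidden_states:
        raise ValueError("need at least one hidden state")
    # Dot-product relevance score for every event in the prefix.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    # Numerically stable softmax turns scores into weights that sum to 1.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: weighted sum of the hidden states.
    dim = len(query)
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return weights, context

# Toy prefix of three events with 2-dimensional hidden states.
weights, context = attention_pool([[1.0, 0.0], [0.0, 1.0], [3.0, 0.0]], [1.0, 0.0])
print(weights, context)
```

Events whose hidden states align poorly with the query receive small weights, which is one way low-relevance (noisy) events can be down-weighted before the final prediction layer.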
Hua HUANG, Ziyi YANG, Xiaolong LI, Chuang LI. Predictive business process monitoring method based on concept drift[J]. Journal of Computer Applications, 2024, 44(10): 3167-3176.
Event log | Cases | Events | Activities | Case attributes |
---|---|---|---|---|
Permit | 7 065 | 86 581 | 51 | 168 |
International declarations | 6 449 | 72 151 | 34 | 18 |
Domestic declarations | 10 500 | 56 437 | 17 | 5 |
Prepaid travel costs | 2 099 | 18 246 | 29 | 17 |
Request for payment | 6 886 | 36 796 | 19 | 9 |
Tab. 1 Metadata for different logs
Event log | Parsed log |
---|---|
Declaration APPROVED by ADMINISTRATION | Declaration APPROVED |
Declaration FINAL APPROVED by DIRECTOR | Declaration FINAL APPROVED |
Declaration FOR APPROVAL by ADMINISTRATION | Declaration FOR APPROVAL |
Declaration REJECTED by ADMINISTRATION | Declaration REJECTED |
Tab. 2 Examples of event log parsing
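The parsing shown in Tab. 2 can be sketched as a small normalization function. This is an assumption about the rule, inferred only from the four example rows: a trailing " by ROLE" suffix with an upper-case role is removed. The name `parse_activity` and the regular expression are hypothetical, not taken from the paper.

```python
import re

# Hypothetical rule: a trailing " by <ROLE>" suffix, where ROLE is upper-case.
_ROLE_SUFFIX = re.compile(r"\s+by\s+[A-Z][A-Z_ ]*$")

def parse_activity(label: str) -> str:
    """Drop the trailing ' by ROLE' part of an activity label, if present."""
    return _ROLE_SUFFIX.sub("", label)

# The four raw labels listed in Tab. 2.
examples = [
    "Declaration APPROVED by ADMINISTRATION",
    "Declaration FINAL APPROVED by DIRECTOR",
    "Declaration FOR APPROVAL by ADMINISTRATION",
    "Declaration REJECTED by ADMINISTRATION",
]
for raw in examples:
    print(raw, "->", parse_activity(raw))
```

Labels without such a suffix pass through unchanged, so the function can be applied to a whole activity column before encoding.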
Event log | Activities | Resources | Events | Traces | Trace variants |
---|---|---|---|---|---|
BPI20T | 51 | 2 | 86 581 | 7 065 | 1 478 |
BPI20I | 34 | 2 | 72 151 | 6 449 | 753 |
BPI20D | 17 | 2 | 56 437 | 5 | 10 500 |
BPI20P | 29 | 2 | 18 246 | 2 099 | 202 |
BPI20R | 19 | 2 | 36 796 | 6 886 | 89 |
BPI12L | 36 | 69 | 262 200 | 13 087 | 4 366 |
Tab. 3 Characteristics of event logs used in experiments
Event log | SVM | LR | RF | LSTM | Bi-LSTM | Att-Bi-LSTM |
---|---|---|---|---|---|---|
Average | 0.67 | 0.66 | 0.63 | 0.69 | 0.74 | 0.78 |
BPI20T | 0.58 | 0.56 | 0.54 | 0.62 | 0.63 | 0.79 |
BPI20I | 0.72 | 0.69 | 0.73 | 0.77 | 0.80 | 0.83 |
BPI20D | 0.77 | 0.73 | 0.67 | 0.79 | 0.83 | 0.85 |
BPI20P | 0.80 | 0.79 | 0.77 | 0.83 | 0.85 | 0.88 |
BPI20R | 0.63 | 0.60 | 0.58 | 0.66 | 0.71 | 0.72 |
BPI12A | 0.64 | 0.61 | 0.58 | 0.60 | 0.70 | 0.73 |
BPI12J | 0.60 | 0.59 | 0.53 | 0.62 | 0.65 | 0.68 |
BPI12C | 0.64 | 0.67 | 0.60 | 0.65 | 0.71 | 0.72 |
Tab. 4 Comparison of overall AUC of different process result prediction models on different datasets
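As a worked check of the averages in Tab. 4, the sketch below computes the relative gain of Att-Bi-LSTM over each baseline. It assumes that the improvement range quoted in the abstract is a relative improvement of the average AUC (the abstract itself calls it prediction accuracy); the helper name `relative_gain` is hypothetical.

```python
# Average AUC values copied from the average row of Tab. 4.
avg_auc = {
    "SVM": 0.67,
    "LR": 0.66,
    "RF": 0.63,
    "LSTM": 0.69,
    "Bi-LSTM": 0.74,
    "Att-Bi-LSTM": 0.78,
}

def relative_gain(proposed: float, baseline: float) -> float:
    """Relative improvement of `proposed` over `baseline`, in percent."""
    return (proposed - baseline) / baseline * 100.0

proposed = avg_auc["Att-Bi-LSTM"]
gains = {
    name: round(relative_gain(proposed, auc), 1)
    for name, auc in avg_auc.items()
    if name != "Att-Bi-LSTM"
}
print(gains)
# Smallest and largest relative gains across the five baselines.
print(min(gains.values()), max(gains.values()))
```

Under this reading, the smallest gain (over Bi-LSTM) and the largest gain (over RF) come out at 5.4% and 23.8%, which matches the range stated in the abstract.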
Event log | SVM | LR | RF | LSTM | Bi-LSTM | Att-Bi-LSTM |
---|---|---|---|---|---|---|
Average | 23 | 23 | 23 | 23 | 23 | 20 |
BPI20R | 31 | 30 | 24 | 31 | 30 | 22 |
BPI12A | 17 | 17 | 16 | 17 | 17 | 16 |
BPI12J | 27 | 28 | 34 | 27 | 27 | 25 |
BPI12C | 16 | 15 | 16 | 16 | 16 | 15 |
Tab. 5 Earliness comparison of different process result prediction models
Event log | SVM | LR | RF | LSTM | Bi-LSTM | Att-Bi-LSTM |
---|---|---|---|---|---|---|
Average | 11 621 | 4 306 | 14 716 | 11 587 | 12 167 | 11 532 |
BPI20T | 23 385 | 9 066 | 28 943 | 23 014 | 24 821 | 21 605 |
BPI20I | 19 463 | 8 170 | 21 934 | 19 782 | 20 694 | 19 617 |
BPI20D | 483 | 57 | 74 | 163 | 107 | 736 |
BPI20P | 16 801 | 6 711 | 18 006 | 15 097 | 15 826 | 14 357 |
BPI20R | 21 859 | 8 539 | 28 412 | 25 493 | 27 845 | 21 375 |
BPI12A | 6 890 | 1 164 | 7 732 | 7 032 | 3 659 | 7 365 |
BPI12J | 1 907 | 545 | 7 954 | 974 | 974 | 1 237 |
BPI12C | 2 183 | 196 | 4 673 | 1 137 | 3 407 | 5 964 |
Tab. 6 Offline training time of different models on each dataset (s)
Event log | SVM | LR | RF | LSTM | Bi-LSTM | Att-Bi-LSTM |
---|---|---|---|---|---|---|
Average | 20 | 17 | 22 | 2 | 2 | 4 |
BPI20T | 21 | 15 | 23 | 2 | 3 | 3 |
BPI20I | 20 | 17 | 22 | 3 | 2 | 2 |
BPI20D | 37 | 32 | 41 | 1 | 1 | 2 |
BPI20P | 15 | 14 | 19 | 1 | 2 | 3 |
BPI20R | 30 | 26 | 34 | 2 | 3 | 5 |
BPI12A | 10 | 9 | 11 | 4 | 3 | 9 |
BPI12J | 12 | 13 | 12 | 1 | 1 | 3 |
BPI12C | 12 | 11 | 12 | 3 | 3 | 2 |
Tab. 7 Online prediction time of different models on each dataset (ms)