Journal of Computer Applications, 2022, Vol. 42, Issue (1): 280-286. DOI: 10.11772/j.issn.1001-9081.2021020306
• Frontier and comprehensive applications •
Lu ZHANG, Jiapeng LIU, Dongmei TIAN
Received: 2021-03-02
Revised: 2021-06-21
Accepted: 2021-06-23
Online: 2022-01-11
Published: 2022-01-10
Contact: Jiapeng LIU
About author: ZHANG Lu, born in 1995, from Ningbo, Zhejiang, M. S. candidate. Her research interests include corporate finance and data mining.
Lu ZHANG, Jiapeng LIU, Dongmei TIAN. Application of Stacking-Bagging-Vote multi-source information fusion model for financial early warning[J]. Journal of Computer Applications, 2022, 42(1): 280-286.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2021020306
Actual class | Predicted P | Predicted N |
---|---|---|
P | TP | FN |
N | FP | TN |
Tab.1 Matrix of classification results
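The four counts in Tab.1 (TP, FN, FP, TN) are the inputs to the recall, precision, F1 and G-mean columns reported in the later tables. The following is a minimal sketch, assuming the standard definitions of these metrics (this page does not state the formulas); the function name and the example counts are hypothetical.

```python
import math

def classification_metrics(tp, fn, fp, tn):
    """Compute recall, precision, F1 and G-mean from confusion-matrix counts.

    Assumes the usual definitions; empty denominators yield 0.0.
    """
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # true positive rate
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # true negative rate
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    else:
        f1 = 0.0
    g_mean = math.sqrt(recall * specificity)             # geometric mean
    return {"recall": recall, "precision": precision, "f1": f1, "g_mean": g_mean}

# Hypothetical counts, not taken from the paper.
metrics = classification_metrics(tp=8, fn=2, fp=12, tn=78)
print(metrics)
```

On imbalanced data, G-mean rewards a model only when it does well on both classes, which is presumably why it is reported alongside F1.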
Type | Number of samples | Proportion/% | Value of Y |
---|---|---|---|
Total | 2936 | 100.00 | |
AB | 31 | 1.05 | 1 |
AD | 51 | 1.74 | 1 |
AX | 2 | 0.07 | 1 |
AA | 2852 | 97.14 | 0 |
Tab.2 Distribution of samples
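The distribution in Tab.2 is heavily imbalanced: three sample types are labelled Y = 1 and one type, with most of the samples, is labelled Y = 0. As a small illustrative sketch, the snippet below recomputes the share of positive samples and the negative-to-positive ratio from the counts in the table; the variable names are assumptions made for the example.

```python
# Sample counts and Y labels copied from Tab.2.
counts = {"AB": 31, "AD": 51, "AX": 2, "AA": 2852}
labels = {"AB": 1, "AD": 1, "AX": 1, "AA": 0}

total = sum(counts.values())
positives = sum(n for t, n in counts.items() if labels[t] == 1)
negatives = total - positives

positive_share = 100 * positives / total   # percentage of Y = 1 samples
imbalance_ratio = negatives / positives    # negatives per positive sample

print(f"total={total}, positives={positives}, negatives={negatives}")
print(f"positive share={positive_share:.2f}%, ratio={imbalance_ratio:.1f}:1")
```

With roughly 34 negative samples per positive one, plain accuracy is uninformative, which motivates the resampling algorithms compared in the next table.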
Sampling algorithm | Ensemble model | Recall | Precision | F1 score | G-mean |
---|---|---|---|---|---|
Ensemble Up-Down | BV-EN | 0.7619 | 0.1133 | 0.1972 | 0.7931 |
Ensemble Up-Down | BV-RF | 0.7857 | 0.3708 | 0.5038 | 0.8732 |
Ensemble Up-Down | BV-XGBoost | 0.8452 | 0.2829 | 0.4239 | 0.8911 |
Ensemble Up-Down | BV-Models | 0.8690 | 0.2980 | 0.4438 | 0.9044 |
Ensemble Up-Down | Stacking-DT | 0.8333 | 0.1230 | 0.2144 | 0.8292 |
Ensemble Up-Down | Stacking-SVM | 0.9405 | 0.0927 | 0.1688 | 0.8347 |
Ensemble Up-Down | Stacking-LR | 0.8810 | 0.1296 | 0.2260 | 0.8533 |
SMOTE | BV-EN | 0.5357 | 0.2113 | 0.3030 | 0.7100 |
SMOTE | BV-RF | 0.8214 | 0.2491 | 0.3823 | 0.8727 |
SMOTE | BV-XGBoost | 0.8214 | 0.2277 | 0.3566 | 0.8684 |
SMOTE | BV-Models | 0.6548 | 0.3481 | 0.4545 | 0.7944 |
SMOTE | Stacking-DT | 0.9048 | 0.1603 | 0.2724 | 0.8823 |
SMOTE | Stacking-SVM | 0.9286 | 0.1393 | 0.2422 | 0.8798 |
SMOTE | Stacking-LR | 0.9048 | 0.1919 | 0.3167 | 0.8962 |
Tomek-Smote | BV-EN | 0.5714 | 0.1514 | 0.2394 | 0.7385 |
Tomek-Smote | BV-RF | 0.8452 | 0.2298 | 0.3613 | 0.8802 |
Tomek-Smote | BV-XGBoost | 0.8095 | 0.1910 | 0.3091 | 0.8531 |
Tomek-Smote | BV-Models | 0.6071 | 0.3778 | 0.4658 | 0.7676 |
Tomek-Smote | Stacking-DT | 0.8810 | 0.1832 | 0.3033 | 0.8826 |
Tomek-Smote | Stacking-SVM | 0.9167 | 0.1385 | 0.2406 | 0.8744 |
Tomek-Smote | Stacking-LR | 0.9167 | 0.2010 | 0.3298 | 0.9046 |
Tab.3 Model prediction results based on different sampling algorithms
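Two of the sampling algorithms in Tab.3 build on SMOTE, which oversamples the minority class by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below is a simplified pure-Python variant written for illustration only: it draws the base sample at random rather than looping over every minority sample, and the function name, parameters and toy data are hypothetical. It is not the implementation used in the paper.

```python
import random

def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points from a list of minority-class tuples.

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        x = minority[i]
        # Sort the other minority samples by squared Euclidean distance to x.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, neighbour)))
    return synthetic

# Toy two-feature minority class (hypothetical values).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority, n_new=6)
print(new_points)
```

Because every synthetic point is a convex combination of two real minority samples, the new points stay inside the region already occupied by the minority class.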
Ranking index | Fusion model | Recall | Precision | F1 score | G-mean |
---|---|---|---|---|---|
Recall | SBV-MF-R1 | 0.9762 | 0.1327 | 0.2336 | 0.8941 |
Recall | SBV-MF-R2 | 0.9762 | 0.1197 | 0.2133 | 0.8824 |
Recall | SBV-M-R1 | 0.9643 | 0.1439 | 0.2504 | 0.8976 |
Recall | SBV-M-R2 | 0.9643 | 0.1429 | 0.2488 | 0.8969 |
Recall | SBV-S-R1 | 0.9643 | 0.1306 | 0.2301 | 0.8876 |
Recall | SBV-S-R2 | 0.9643 | 0.1205 | 0.2143 | 0.8785 |
Precision | SBV-S-P1 | 0.9167 | 0.2692 | 0.4162 | 0.9217 |
Precision | SBV-S-P2 | 0.8571 | 0.2687 | 0.4091 | 0.8942 |
Precision | SBV-M-P1 | 0.9048 | 0.2559 | 0.3990 | 0.9136 |
Precision | SBV-M-P2 | 0.9286 | 0.2422 | 0.3842 | 0.9215 |
Precision | SBV-MF-P1 | 0.9405 | 0.1612 | 0.2753 | 0.8982 |
Precision | SBV-MF-P2 | 0.9643 | 0.1419 | 0.2473 | 0.8962 |
G-mean | SBV-S-G1 | 0.9286 | 0.2468 | 0.3900 | 0.9226 |
G-mean | SBV-S-G2 | 0.9167 | 0.2692 | 0.4162 | 0.9217 |
G-mean | SBV-M-G1 | 0.9286 | 0.2422 | 0.3842 | 0.9215 |
G-mean | SBV-M-G2 | 0.9405 | 0.2124 | 0.3465 | 0.9189 |
G-mean | SBV-MF-G1 | 0.9405 | 0.1612 | 0.2753 | 0.8982 |
G-mean | SBV-MF-G2 | 0.9643 | 0.1419 | 0.2473 | 0.8962 |
Tab.4 Prediction results based on different ranking indexes
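The table above groups the fusion models by the index used for ranking. How the paper performs that ranking is not described on this page, so the following is only an illustrative sketch: it sorts a handful of rows copied from the table by a chosen metric. The helper name `rank_models` and the `COLUMNS` mapping are assumptions made for the example.

```python
# A few rows copied from the table: (model, recall, precision, F1, G-mean).
rows = [
    ("SBV-MF-R1", 0.9762, 0.1327, 0.2336, 0.8941),
    ("SBV-M-R1", 0.9643, 0.1439, 0.2504, 0.8976),
    ("SBV-S-P1", 0.9167, 0.2692, 0.4162, 0.9217),
    ("SBV-S-G1", 0.9286, 0.2468, 0.3900, 0.9226),
]

# Position of each metric inside a row tuple.
COLUMNS = {"recall": 1, "precision": 2, "f1": 3, "g_mean": 4}

def rank_models(rows, index):
    """Return model names sorted from best to worst on the given index."""
    col = COLUMNS[index]
    return [r[0] for r in sorted(rows, key=lambda r: r[col], reverse=True)]

by_recall = rank_models(rows, "recall")
by_g_mean = rank_models(rows, "g_mean")
print(by_recall)
print(by_g_mean)
```

The two orderings differ: the model with the highest recall in this subset has the lowest G-mean, which illustrates the trade-off between catching every positive case and keeping false alarms down.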