Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (9): 3057-3066.DOI: 10.11772/j.issn.1001-9081.2025020202
• Frontier and comprehensive applications •
Received: 2025-03-04
Revised: 2025-04-14
Accepted: 2025-04-28
Online: 2025-05-16
Published: 2025-09-10
Contact: Yiyi ZHAO
About author: LU Yanqun, born in 1986 in Chengdu, Sichuan, Ph. D., lecturer. Her research interests include financial data mining.
Yanqun LU, Yiyi ZHAO. Customer churn prediction model integrating hierarchical graph neural network and specific feature learning[J]. Journal of Computer Applications, 2025, 45(9): 3057-3066.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2025020202
Dataset | Total samples | Positive samples/% | Total features | Continuous features | Discrete features |
---|---|---|---|---|---|
Dataset A | 165 034 | 5.8 | 27 | 22 | 5 |
Dataset B | 750 000 | 4.5 | 624 | 508 | 116 |
Tab. 1 Descriptive statistical analysis of datasets
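
To make the degree of class imbalance in Tab. 1 concrete, the following sketch converts the reported positive-sample percentages into approximate class counts and a negative-to-positive ratio. The percentages in the table are rounded to one decimal place, so the counts are estimates only; the function name `class_counts` is hypothetical and not taken from the paper.

```python
from typing import Tuple

# (total samples, positive share in %) as reported in Tab. 1.
DATASETS = {"A": (165_034, 5.8), "B": (750_000, 4.5)}


def class_counts(total: int, positive_pct: float) -> Tuple[int, int, float]:
    """Estimate positive count, negative count and negatives per positive.

    The input percentage is rounded in the source table, so the result
    is an approximation rather than the exact dataset composition.
    """
    positives = round(total * positive_pct / 100)
    negatives = total - positives
    return positives, negatives, negatives / positives


if __name__ == "__main__":
    for name, (total, pct) in DATASETS.items():
        pos, neg, ratio = class_counts(total, pct)
        print(f"Dataset {name}: ~{pos} positives, ~{neg} negatives, ~{ratio:.1f}:1")
```

Under these estimates, both datasets have well over ten negative samples per positive sample, which is the setting the imbalance comparison in Tab. 4 addresses.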
Dataset | Model | AUC | F1-Score | AP | KS |
---|---|---|---|---|---|
A | LightGBM | 87.77 | 64.97 | 71.12 | 59.05 |
A | DNN | 87.69 | 65.02 | 70.29 | 59.09 |
A | CNN | 87.69 | 64.85 | 70.59 | 59.01 |
A | ResNet | 87.71 | 64.65 | 69.85 | 59.46 |
A | GCN | 80.64 | 57.33 | 60.58 | 47.52 |
A | GAT | 86.78 | 63.89 | 68.79 | 57.81 |
A | TabNet | 87.00 | 63.40 | 69.05 | 57.91 |
A | SFLN | 88.82 | 66.68 | 73.38 | 61.28 |
A | HGNN-SFLN | 89.69 | 67.13 | 73.84 | 59.75 |
B | LightGBM | 90.99 | 58.56 | 59.96 | 67.97 |
B | DNN | 92.07 | 55.17 | 57.27 | 68.24 |
B | CNN | 92.33 | 56.46 | 60.10 | 68.76 |
B | ResNet | 90.23 | 58.38 | 59.10 | 67.62 |
B | GCN | 85.75 | 51.36 | 54.58 | 62.52 |
B | GAT | 89.08 | 51.89 | 56.79 | 61.81 |
B | TabNet | 92.80 | 57.28 | 59.59 | 70.13 |
B | SFLN | 93.51 | 62.94 | 67.18 | 71.90 |
B | HGNN-SFLN | 93.79 | 63.88 | 67.33 | 71.21 |
Tab. 2 Comparison of results of different models on datasets A and B
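
For readers unfamiliar with the four columns in Tab. 2, the sketch below shows one common way to compute AUC, KS, AP and F1 from binary labels and predicted scores, using only the Python standard library. It is an illustration under stated assumptions, not the paper's evaluation code: the function names are hypothetical, the 0.5 threshold used for F1 is an assumption, and the table values appear to be these quantities expressed as percentages.

```python
from typing import List, Sequence, Tuple


def roc_points(y_true: Sequence[int], scores: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (FPR, TPR) points, one per distinct score, starting at (0, 0)."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative sample")
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(order):
        s = scores[order[i]]
        # Consume every sample tied at this score before emitting a point.
        while i < len(order) and scores[order[i]] == s:
            if y_true[order[i]] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points: List[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:]))


def ks_statistic(points: List[Tuple[float, float]]) -> float:
    """Kolmogorov-Smirnov statistic: the largest gap between TPR and FPR."""
    return max(tpr - fpr for fpr, tpr in points)


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision at the rank of each positive (ties keep input order)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            tp += 1
            total += tp / rank
    if tp == 0:
        raise ValueError("need at least one positive sample")
    return total / tp


def f1_score(y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> float:
    """F1 at a fixed decision threshold (the 0.5 default is an assumption)."""
    tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    fn = sum(1 for y, s in zip(y_true, scores) if s < threshold and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    labels = [1, 0, 1, 0, 0, 1]                  # toy data, not from the paper
    preds = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
    pts = roc_points(labels, preds)
    print(auc(pts), ks_statistic(pts), average_precision(labels, preds), f1_score(labels, preds))
```

AUC, AP and KS depend only on the ranking of the scores, whereas F1 depends on the chosen threshold. This is one reason a model can lead on AUC yet trail on KS or F1, as HGNN-SFLN and SFLN do on the KS column of Tab. 2.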
Model | AUC | F1-Score | AP | KS |
---|---|---|---|---|
SFLN | 93.51 | 62.94 | 67.18 | 71.90 |
HGNN-SFLN-2H | 88.44 | 37.68 | 28.08 | 61.59 |
HGNN-SFLN-4H | 91.30 | 48.33 | 51.69 | 63.45 |
HGNN-SFLN-8H | 93.79 | 63.88 | 67.33 | 71.21 |
HGNN-SFLN-16H | 85.36 | 37.02 | 25.44 | 59.80 |
Tab. 3 Influence of different parameters on models
Method | AUC | F1-Score | AP | KS |
---|---|---|---|---|
No processing | 84.57 | 49.88 | 53.87 | 59.11 |
Oversampling | 90.11 | 47.09 | 43.51 | 66.41 |
Undersampling | 87.01 | 44.53 | 46.55 | 62.53 |
Hybrid sampling | 89.52 | 50.83 | 54.89 | 68.00 |
Proposed method | 93.51 | 62.94 | 67.18 | 71.90 |
Tab. 4 Comparison of data imbalance processing methods
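
The three resampling baselines named in Tab. 4 can be sketched in their simplest random form as follows. This is a minimal illustration under assumptions: the exact sampling procedures and ratios used in the paper are not given on this page, the function names are hypothetical, and the "hybrid" variant here simply moves both classes to their mean size. The proposed method itself is not reproduced.

```python
import random
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (feature vector, binary label)


def _minority_majority(data: Sequence[Sample]) -> Tuple[List[Sample], List[Sample]]:
    """Split samples by label and return (minority, majority)."""
    pos = [s for s in data if s[1] == 1]
    neg = [s for s in data if s[1] == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    return (pos, neg) if len(pos) <= len(neg) else (neg, pos)


def oversample(data: Sequence[Sample], rng: random.Random) -> List[Sample]:
    """Duplicate random minority samples until both classes are the same size."""
    minority, majority = _minority_majority(data)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return majority + minority + extra


def undersample(data: Sequence[Sample], rng: random.Random) -> List[Sample]:
    """Keep a random majority subset the same size as the minority class."""
    minority, majority = _minority_majority(data)
    return minority + rng.sample(majority, len(minority))


def hybrid_sample(data: Sequence[Sample], rng: random.Random) -> List[Sample]:
    """Shrink the majority and grow the minority to the mean class size (assumed variant)."""
    minority, majority = _minority_majority(data)
    target = (len(minority) + len(majority)) // 2
    kept = rng.sample(majority, target)
    extra = [rng.choice(minority) for _ in range(target - len(minority))]
    return kept + minority + extra


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy data: 2 positives and 8 negatives, not taken from the paper.
    toy = [([float(i)], 1 if i < 2 else 0) for i in range(10)]
    for fn in (oversample, undersample, hybrid_sample):
        out = fn(toy, rng)
        print(fn.__name__, len(out), sum(label for _, label in out))
```

Oversampling repeats minority samples and undersampling discards majority samples. Either effect may help explain why the resampling rows in Tab. 4 improve AUC over the unprocessed baseline without a matching gain in F1-Score or AP, although the page itself does not state the cause.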