Journal of Computer Applications, 2025, Vol. 45, Issue (9): 3057-3066. DOI: 10.11772/j.issn.1001-9081.2025020202

• Frontier and comprehensive applications •

Customer churn prediction model integrating hierarchical graph neural network and specific feature learning

Yanqun LU1, Yiyi ZHAO2

  1. Chengdu Administration of Governance, Chengdu Sichuan 610110, China
  2. Research Institute of Big Data, Southwestern University of Finance and Economics, Chengdu Sichuan 611130, China
  • Received: 2025-03-04 Revised: 2025-04-14 Accepted: 2025-04-28 Online: 2025-05-16 Published: 2025-09-10
  • Contact: Yiyi ZHAO
  • About author: LU Yanqun, born in 1986, a native of Chengdu, Sichuan, Ph. D., lecturer. Her research interests include financial data mining.
  • Supported by:
    Humanities and Social Sciences Planning Fund Project of the Ministry of Education (21YJA630122)

Abstract:

To address the severity of customer churn in inclusive finance and the shortcomings of existing customer retention models in prediction accuracy and interpretability, a customer churn prediction model integrating a Hierarchical Graph Neural Network (HGNN) and Specific Feature Learning (SFL), named HGNN-SFLN (HGNN-SFL Network), was proposed to enhance prediction capability and the understanding of feature interactions. Firstly, to address data imbalance, a hybrid sampling strategy was introduced, and feature-level weighting adjustments were applied to the different feature categories to ensure that all data types were used effectively. Secondly, a hierarchical graph was used to strengthen the correlations between different features, and an SFL module based on the self-attention mechanism was constructed to improve the model’s ability to process categorical features and to analyze feature interactions. This module enabled the model to identify key features accurately and to capture the complex interactions among them, thereby optimizing the prediction decision-making process. Experimental results demonstrate that the proposed model achieves the best results on multiple real-world financial datasets in key indicators such as Area Under Curve (AUC), compared with mainstream models such as LightGBM (Light Gradient Boosting Machine) and Deep Neural Network (DNN). Furthermore, the proposed model shows significant advantages over the comparison models in accurately identifying critical churn-related features and in capturing complex feature interactions.
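The abstract does not say how the hybrid sampling strategy is constructed. As a rough illustration of the general idea only, the sketch below combines random undersampling of the majority class with random oversampling of the minority class. The function name `hybrid_resample`, the "meet at the mean class size" rule, and the toy data are all assumptions made for this example and are not taken from the paper.

```python
import random
from collections import Counter


def hybrid_resample(rows, labels, seed=0):
    """Hypothetical helper: balance a binary dataset by undersampling the
    majority class and oversampling the minority class to a common size.
    This is a generic sketch, not the sampling strategy of HGNN-SFLN."""
    rng = random.Random(seed)
    counts = Counter(labels)
    if len(counts) != 2:
        raise ValueError("expected exactly two classes")
    (major, n_major), (minor, n_minor) = counts.most_common(2)

    # Assumed rule: both classes are resampled to the mean class size.
    target = (n_major + n_minor) // 2

    major_idx = [i for i, y in enumerate(labels) if y == major]
    minor_idx = [i for i, y in enumerate(labels) if y == minor]

    # Undersample the majority class without replacement.
    kept_major = rng.sample(major_idx, target)
    # Keep every minority sample and add random duplicates (with replacement).
    extra_minor = rng.choices(minor_idx, k=target - n_minor)

    chosen = kept_major + minor_idx + extra_minor
    rng.shuffle(chosen)
    return [rows[i] for i in chosen], [labels[i] for i in chosen]


if __name__ == "__main__":
    # Toy data: 90 retained customers (label 0) and 10 churned customers (label 1).
    labels = [0] * 90 + [1] * 10
    rows = [[i] for i in range(100)]
    new_rows, new_labels = hybrid_resample(rows, labels)
    print(Counter(new_labels))
```

With the toy data above, both classes end up with 50 samples: 50 of the 90 majority rows are kept, and the 10 minority rows are supplemented by 40 random duplicates. Resampling of this kind would normally be applied to the training split only, so that evaluation metrics such as AUC are computed on the original class distribution.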

Key words: customer churn prediction, data imbalance, feature interaction modeling, specific feature, Hierarchical Graph Neural Network (HGNN)

CLC Number: