Journal of Computer Applications ›› 2026, Vol. 46 ›› Issue (5): 1475-1481. DOI: 10.11772/j.issn.1001-9081.2025050656

• Data science and technology •

Competitive loss-driven generative imbalanced node classification

Fengwei CHENG1, Bingqi ZHANG2, Guohua XU1, Wenjian WANG2,3

  1. Department of Computer Science and Technology, Taiyuan University, Taiyuan Shanxi 030032, China
  2. Department of Network Security and Protection, Shanxi Police College, Taiyuan Shanxi 030401, China
  3. Key Laboratory of Data Intelligence and Cognitive Computing (Shanxi University), Taiyuan Shanxi 030006, China
  • Received: 2025-06-16 Revised: 2025-07-13 Accepted: 2025-07-22 Online: 2025-08-13 Published: 2026-05-10
  • Contact: Wenjian WANG
  • About author: CHENG Fengwei, born in 1988, M. S., associate professor. Her research interests include machine learning and artificial intelligence.
    ZHANG Bingqi, born in 1996, M. S., lecturer. Her research interests include machine learning and artificial intelligence.
    XU Guohua, born in 1974, M. S., associate professor. Her research interests include data mining, big data, and information security.
  • Supported by:
    Innovation Program for Higher Education Institutions of Shanxi Province (2024L382); Scientific Research Project of Taiyuan University (24TYYB02); National Natural Science Foundation of China (U21A20513)

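竞争损失驱动的生成式不平衡节点分类 (Competitive loss-driven generative imbalanced node classification)

程凤伟 (Fengwei CHENG)1, 张槟淇 (Bingqi ZHANG)2, 徐国华 (Guohua XU)1, 王文剑 (Wenjian WANG)2,3

  1. Department of Computer Science and Technology, Taiyuan University, Taiyuan 030032
  2. Department of Network Security and Protection, Shanxi Police College, Taiyuan 030401
  3. Key Laboratory of Data Intelligence and Cognitive Computing of Shanxi Province (Shanxi University), Taiyuan 030006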
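  • Corresponding author: 王文剑 (Wenjian WANG)
  • About the authors: 程凤伟 (CHENG Fengwei), born in 1988, female, native of Zhoukou, Henan, associate professor, M. S., CCF member. Her research interests include machine learning and artificial intelligence.
    张槟淇 (ZHANG Bingqi), born in 1996, female, native of Taiyuan, Shanxi, lecturer, M. S. Her research interests include machine learning and artificial intelligence.
    徐国华 (XU Guohua), born in 1974, female, native of Xinzhou, Shanxi, associate professor, M. S. Her research interests include data mining, big data, and information security.
  • Funding:
    National Natural Science Foundation of China (U21A20513, 62476157); Science and Technology Innovation Program for Higher Education Institutions of Shanxi Province (2024L382); Institute-level Scientific Research Project of Taiyuan University (24TYYB02)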

Abstract:

Graph Neural Networks (GNNs) have achieved significant success in node classification, but their performance typically relies on the abundant labeled data of majority classes, which can bias the representations of minority-class nodes with scarce labels. Traditional oversampling techniques mitigate class imbalance by replicating minority samples, but they easily overfit to local neighborhoods. Recent approaches synthesize new nodes from minority-class anchors, but they do not fully exploit the relationships between minority classes and their adjacent classes, so the generated samples have blurred class boundaries. To address these challenges, a Competitive loss-driven Generative imbalanced node classification algorithm (GraphCG) was proposed. A feature-structure collaborative auxiliary node selection mechanism was designed to precisely identify auxiliary nodes from neighboring classes that can strengthen class boundaries. Furthermore, a competitive boundary-constrained loss function was constructed to force generated nodes and majority classes to remain geometrically separable in the embedding space. Experimental results showed that, compared with current state-of-the-art methods, GraphCG achieved significant improvements on multiple class-imbalanced datasets. GraphCG not only enhances data diversity but also improves class separability, preventing minority classes from being overshadowed by majority classes.

Key words: Graph Neural Network (GNN), node classification, imbalanced data, competitive loss, separability
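
The abstract names two components, auxiliary node selection and a competitive boundary-constrained loss, without giving their formulas. The following is a minimal sketch of how such components could look, not the GraphCG formulation itself. Everything in it is an assumption made for illustration: the function names `select_auxiliary` and `competitive_margin_loss`, the scoring rule (cosine feature similarity blended with Jaccard neighborhood overlap through a hypothetical weight `alpha`), the use of class centers, and the hinge form with a hypothetical `margin`.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def select_auxiliary(anchor, candidates, features, neighbors, alpha=0.5):
    """Return the candidate node with the highest feature/structure score.

    ASSUMPTION: score = alpha * cosine feature similarity
                      + (1 - alpha) * Jaccard overlap of the two neighborhoods.
    """
    def jaccard(a, b):
        union = neighbors[a] | neighbors[b]
        return len(neighbors[a] & neighbors[b]) / len(union) if union else 0.0

    def score(c):
        return (alpha * cosine(features[anchor], features[c])
                + (1 - alpha) * jaccard(anchor, c))

    return max(candidates, key=score)


def competitive_margin_loss(z_gen, minority_center, majority_center, margin=1.0):
    """Hinge-style penalty on a generated node embedding.

    ASSUMPTION: the loss is zero once the generated embedding is closer to the
    minority-class center than to the majority-class center by at least `margin`.
    """
    d_min = euclidean(z_gen, minority_center)
    d_maj = euclidean(z_gen, majority_center)
    return max(0.0, margin + d_min - d_maj)


# Toy data: node 0 is a minority-class anchor; nodes 1 and 2 are candidates
# drawn from a neighboring class.
features = {0: [1.0, 0.0], 1: [0.9, 0.1], 2: [0.0, 1.0]}
neighbors = {0: {1, 3}, 1: {0, 3}, 2: {4}}

best = select_auxiliary(0, [1, 2], features, neighbors)
loss_far = competitive_margin_loss([0.0, 0.0], [0.0, 0.0], [3.0, 4.0])
loss_near = competitive_margin_loss([3.0, 4.0], [0.0, 0.0], [3.0, 4.0])
```

In this toy setting node 1 is selected because it is similar to the anchor in features and shares a neighbor with it. The loss is 0.0 for an embedding that sits on the minority center and 6.0 for one that sits on the majority center, which illustrates the intended pressure away from the majority class.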

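摘要 (Abstract):

Graph Neural Networks (GNNs) have achieved notable success in node classification, but their performance usually depends on the abundant labeled data of majority classes, so the representations of minority-class nodes with scarce labels may be biased. Traditional oversampling methods alleviate class imbalance by duplicating minority-class samples, but they easily cause overfitting to local neighborhoods. Recent studies attempt to synthesize new nodes from minority-class anchors, but they do not fully exploit the relational information between minority classes and their neighboring classes, which leaves the generated samples with blurred class boundaries. To address these challenges, a Competitive loss-driven Generative imbalanced node classification algorithm (GraphCG) is proposed. A feature-structure collaborative auxiliary node selection mechanism fuses node features with local structure to accurately locate auxiliary nodes in neighboring classes that can strengthen class boundaries, and a competitive boundary-constrained loss function forces generated nodes and majority classes to keep a separable geometric boundary in the embedding space. Experimental results show that GraphCG has significant advantages over current advanced methods on multiple class-imbalanced datasets. GraphCG not only increases data diversity but also improves class separability, preventing minority classes from being suppressed by majority classes.

关键词 (Key words): Graph Neural Network, node classification, imbalanced data, competitive loss, separability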

CLC Number: