Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (11): 3307-3321. DOI: 10.11772/j.issn.1001-9081.2021122060

• The 9th CCF Conference on Big Data •


Survey on imbalanced multi-class classification algorithms

Mengmeng LI1, Yi LIU1, Gengsong LI1, Qibin ZHENG2, Wei QIN1, Xiaoguang REN1

  1. Defense Innovation Institute, Academy of Military Science, Beijing 100071, China
  2. Academy of Military Science, Beijing 100091, China
  • Received: 2021-12-06 Revised: 2021-12-30 Accepted: 2022-01-18 Online: 2022-03-04 Published: 2022-11-10
  • Contact: Yi LIU (albertliu20th@163.com)
  • About author: LI Mengmeng, born in 1992 in Handan, Hebei, M. S. candidate. Her research interests include data quality, evolutionary algorithms.
    LIU Yi (Hui ethnicity), born in 1990 in Bengbu, Anhui, Ph. D., research assistant. His research interests include robot operating system, data quality, evolutionary algorithms.
    LI Gengsong, born in 1999, M. S. candidate. His research interests include big data, algorithm selection.
    ZHENG Qibin, born in 1990, Ph. D., research assistant. His research interests include data engineering, data mining, machine learning.
    QIN Wei, born in 1983, M. S., research assistant. His research interests include intelligent information system management.
    REN Xiaoguang, born in 1986, Ph. D., associate research fellow. His research interests include robot operating system, high-performance computing, numerical computation and simulation.
  • Supported by:
    National Natural Science Foundation of China (61802426)





Abstract: Imbalanced data classification is an important research topic in machine learning, but most existing imbalanced data classification algorithms focus on binary classification, and studies on imbalanced multi-class classification are relatively few. However, datasets in practical applications usually have multiple classes and imbalanced data distributions, and the diversity of classes further increases the difficulty of imbalanced data classification, so imbalanced multi-class classification has become a research problem in urgent need of a solution. The imbalanced multi-class classification algorithms proposed in recent years were reviewed. According to whether a decomposition strategy was adopted, imbalanced multi-class classification algorithms were divided into decomposition methods and ad-hoc methods. According to the decomposition strategy adopted, the decomposition methods were further divided into two frameworks: One Vs. One (OVO) and One Vs. All (OVA). According to the techniques used, the ad-hoc methods were divided into data-level methods, algorithm-level methods, cost-sensitive methods, ensemble methods and deep network-based methods. The advantages and disadvantages of these methods and their representative algorithms were systematically described, the evaluation indicators of imbalanced multi-class classification methods were summarized, the performance of the representative methods was analyzed in depth through experiments, and the future development directions of imbalanced multi-class classification were discussed.

Key words: imbalanced classification, multi-class classification, imbalanced multi-class classification, classification algorithm, machine learning
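
The two decomposition frameworks named in the abstract, OVO and OVA, can be illustrated with a minimal sketch. It is not taken from the paper: the helper names (`ova_subproblems`, `ovo_subproblems`, `imbalance_ratio`) and the toy label counts are assumptions chosen for illustration. It only shows how a multi-class label vector is split into binary subproblems and how imbalanced each subproblem becomes.

```python
from collections import Counter
from itertools import combinations


def ova_subproblems(labels):
    """One binary problem per class: that class (1) versus all others (0)."""
    return {c: [1 if y == c else 0 for y in labels] for c in sorted(set(labels))}


def ovo_subproblems(labels):
    """One binary problem per class pair, keeping only samples of the two classes."""
    problems = {}
    for a, b in combinations(sorted(set(labels)), 2):
        problems[(a, b)] = [1 if y == a else 0 for y in labels if y in (a, b)]
    return problems


def imbalance_ratio(binary_labels):
    """Majority-class count divided by minority-class count."""
    counts = Counter(binary_labels)
    return max(counts.values()) / min(counts.values())


# Hypothetical toy data: three classes with 60, 30 and 10 samples.
labels = ["a"] * 60 + ["b"] * 30 + ["c"] * 10

ova = ova_subproblems(labels)
ovo = ovo_subproblems(labels)
ova_ratios = {c: imbalance_ratio(y) for c, y in ova.items()}
ovo_ratios = {pair: imbalance_ratio(y) for pair, y in ovo.items()}

print("OVA imbalance ratios:", ova_ratios)
print("OVO imbalance ratios:", ovo_ratios)
```

In this toy example, the OVA subproblem for the rarest class `c` pits 10 samples against 90, while every OVO subproblem involving `c` stays at or below 6 to 1. This is only an illustration of the decomposition step; a complete method would also train a binary classifier per subproblem and aggregate their outputs.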
