Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (11): 3307-3321. DOI: 10.11772/j.issn.1001-9081.2021122060
Special topics: Surveys; The 9th CCF Big Data Conference (CCF Bigdata 2021)
Mengmeng LI1, Yi LIU1, Gengsong LI1, Qibin ZHENG2, Wei QIN1, Xiaoguang REN1
Received:
2021-12-06
Revised:
2021-12-30
Accepted:
2022-01-18
Online:
2022-03-04
Published:
2022-11-10
Contact:
Yi LIU
About author:
LI Mengmeng, born in 1992, M. S. candidate. Her research interests include data quality and evolutionary algorithms.
Abstract:
Imbalanced data classification is an important research topic in machine learning, but existing imbalanced classification algorithms usually target binary problems, and research on imbalanced multi-class classification is comparatively scarce. Datasets in real applications, however, typically contain multiple classes with imbalanced distributions, and the multiplicity of classes further increases the difficulty of classifying imbalanced data, making imbalanced multi-class classification an urgent research problem. This survey reviews imbalanced multi-class classification algorithms proposed in recent years. According to whether a decomposition strategy is adopted, they are divided into decomposition methods and ad-hoc methods; decomposition methods are further split by decomposition strategy into One-Versus-One (OVO) and One-Versus-All (OVA) architectures, while ad-hoc methods are split by processing technique into data-level methods, algorithm-level methods, cost-sensitive methods, ensemble methods, and deep-network-based methods. The advantages, disadvantages, and representative algorithms of each category are described systematically, the evaluation metrics for imbalanced multi-class classification are summarized, the performance of representative methods is analyzed in depth through experiments, and future research directions are discussed.
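The two decomposition architectures named in the abstract can be sketched with scikit-learn's built-in wrappers. This is a minimal illustration on synthetic data, assuming scikit-learn is available; the base classifier and the dataset parameters are arbitrary choices, not taken from the paper:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier

# Synthetic 4-class dataset; 'weights' makes the class sizes imbalanced.
X, y = make_classification(n_samples=800, n_classes=4, n_informative=6,
                           weights=[0.55, 0.25, 0.12, 0.08], random_state=0)

# OVO: one binary classifier per pair of classes -> K*(K-1)/2 = 6 models.
ovo = OneVsOneClassifier(LogisticRegression(max_iter=1000)).fit(X, y)

# OVA (one-vs-rest): one binary classifier per class -> K = 4 models;
# each "class vs. rest" subproblem is artificially imbalanced.
ova = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, y)

print(len(ovo.estimators_), len(ova.estimators_))  # 6 4
```

Note how OVA turns even a balanced K-class problem into K skewed binary ones, which is exactly the drawback of the OVA strategy that the survey discusses.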
CLC number:
Mengmeng LI, Yi LIU, Gengsong LI, Qibin ZHENG, Wei QIN, Xiaoguang REN. Survey on imbalanced multi‑class classification algorithms[J]. Journal of Computer Applications, 2022, 42(11): 3307-3321.
Aspect | Decomposition methods | Ad-hoc methods |
---|---|---|
Advantages | 1. Each binary classifier is relatively easy to train; 2. Existing binary classification algorithms can be fully reused | 1. A multi-class classifier tailored to the problem is trained directly; 2. The distribution information of all samples is fully exploited |
Disadvantages | 1. Multiple binary classifiers must be trained, introducing the problem of combining their outputs; 2. The OVO strategy ignores the samples of the remaining classes, losing information; 3. The OVA strategy artificially introduces imbalance, which increases training difficulty and hurts model performance | 1. The classifier is harder to train; 2. Designing a new algorithm requires substantial development cost and time |

Tab. 1 Comparison of decomposition methods and ad-hoc methods
True label | Predicted positive | Predicted negative |
---|---|---|
Positive | True Positive (TP) | False Negative (FN) |
Negative | False Positive (FP) | True Negative (TN) |

Tab. 2 Confusion matrix of binary classification
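From the four counts in this matrix, the binary forms of the metrics used in the experiment tables below follow directly. A worked example with made-up counts (not values from the paper):

```python
import math

# Hypothetical confusion-matrix counts for an imbalanced binary problem:
# 100 actual positives, 900 actual negatives.
TP, FN, FP, TN = 80, 20, 10, 890

accuracy    = (TP + TN) / (TP + TN + FP + FN)
precision   = TP / (TP + FP)
recall      = TP / (TP + FN)          # true positive rate (sensitivity)
specificity = TN / (TN + FP)          # true negative rate

f1     = 2 * precision * recall / (precision + recall)
g_mean = math.sqrt(recall * specificity)   # balances both class-wise recalls

print(round(accuracy, 4), round(f1, 4), round(g_mean, 4))  # 0.97 0.8421 0.8894
```

Accuracy looks high here mainly because the majority (negative) class dominates; F1 and G-mean expose the weaker minority-class performance, which is why the survey reports them alongside accuracy.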
No. | Dataset | Samples | Features | Classes | Instances per class | Imbalance ratio | Application domain |
---|---|---|---|---|---|---|---|
1 | contraceptive | 1 473 | 9 | 3 | 629/511/333 | 1.89 | contraceptive method choice |
2 | balance | 625 | 4 | 3 | 288/288/49 | 5.88 | balance scale |
3 | newthyroid | 215 | 5 | 3 | 150/35/30 | 5.00 | thyroid disease (new version) |
4 | splice | 3 190 | 60 | 3 | 1 655/768/767 | 2.16 | DNA splice junctions |
5 | thyroid | 7 200 | 21 | 3 | 6 666/368/166 | 40.16 | thyroid disease |
6 | wine | 178 | 13 | 3 | 71/59/48 | 1.48 | wine recognition |
7 | car | 1 728 | 6 | 4 | 1 210/384/69/65 | 18.62 | car evaluation |
8 | page_blocks | 5 472 | 10 | 5 | 4 913/329/115/87/28 | 175.46 | document page blocks |
9 | flare | 1 066 | 11 | 6 | 331/239/211/147/95/43 | 7.70 | solar flares |
10 | satimage | 6 435 | 36 | 6 | 1 533/1 508/1 358/707/703/626 | 2.45 | satellite images |

Tab. 3 Characteristics of experimental datasets
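The imbalance ratio column is the size of the largest class divided by the size of the smallest class; note that the listed values are plain ratios, not percentages. A quick check against the per-class instance counts:

```python
# Imbalance ratio = (largest class size) / (smallest class size),
# verified against the per-class instance counts listed in Table 3.
def imbalance_ratio(class_counts):
    return max(class_counts) / min(class_counts)

print(round(imbalance_ratio([629, 511, 333]), 2))            # contraceptive: 1.89
print(round(imbalance_ratio([6666, 368, 166]), 2))           # thyroid: 40.16
print(round(imbalance_ratio([4913, 329, 115, 87, 28]), 2))   # page_blocks: 175.46
```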
Method groups: decomposition (A&O, OVO, OVA, DOVO, DECOC); data-level (OVO‑SMOTE, OVA‑SMOTE); algorithm-level (OAHO); cost-sensitive (FuzzyImb); ensemble (BBO).

Dataset | A&O | OVO | OVA | DOVO | DECOC | OVO‑SMOTE | OVA‑SMOTE | OAHO | FuzzyImb | BBO |
---|---|---|---|---|---|---|---|---|---|---|
Average | 0.832 5 | 0.866 5 | 0.826 7 | 0.940 5 | 0.920 5 | 0.820 8 | 0.818 0 | 0.892 8 | 0.874 9 | 0.875 8 |
contraceptive | 0.746 1 | 0.755 6 | 0.736 6 | 0.675 5 | 0.663 7 | 0.575 1 | 0.508 1 | 0.756 3 | 0.726 1 | 0.676 2 |
balance | 0.617 6 | 0.606 4 | 0.763 2 | 0.931 3 | 0.927 7 | 0.828 7 | 0.828 9 | 0.720 0 | 0.807 2 | 0.900 8 |
newthyroid | 0.976 7 | 0.976 7 | 0.981 4 | 0.983 9 | 0.990 5 | 0.956 9 | 0.937 8 | 0.981 4 | 0.953 5 | 0.948 8 |
splice | 0.969 3 | 0.970 2 | 0.974 0 | 0.974 5 | 0.930 6 | 0.477 6 | 0.476 6 | 0.975 9 | 0.940 4 | 0.667 9 |
thyroid | 0.994 9 | 0.994 9 | 0.995 0 | 0.998 5 | 0.997 9 | 0.928 4 | 0.938 3 | 0.993 8 | 0.936 7 | 0.959 8 |
wine | 0.977 5 | 0.977 5 | 0.679 8 | 0.984 8 | 0.975 8 | 0.795 9 | 0.752 8 | 0.977 5 | 0.831 5 | 0.799 6 |
car | 0.648 1 | 0.703 1 | 0.915 5 | 0.987 1 | 0.876 8 | 0.952 8 | 0.921 6 | 0.923 6 | 0.909 4 | 0.950 1 |
page_blocks | 0.930 4 | 0.942 3 | 0.949 2 | 0.985 0 | 0.973 2 | 0.837 8 | 0.977 5 | 0.942 8 | 0.907 4 | 0.982 5 |
flare | 0.534 7 | 0.789 9 | 0.508 4 | 0.892 6 | 0.904 7 | 0.892 1 | 0.888 8 | 0.739 2 | 0.778 4 | 0.901 8 |
satimage | 0.929 4 | 0.948 3 | 0.763 6 | 0.991 8 | 0.963 7 | 0.962 4 | 0.950 0 | 0.917 9 | 0.958 3 | 0.970 8 |
Tab. 4 Accuracy values of classic methods on experimental datasets
Method groups: decomposition (A&O, OVO, OVA, DOVO, DECOC); data-level (OVO‑SMOTE, OVA‑SMOTE); algorithm-level (OAHO); cost-sensitive (FuzzyImb); ensemble (BBO).

Dataset | A&O | OVO | OVA | DOVO | DECOC | OVO‑SMOTE | OVA‑SMOTE | OAHO | FuzzyImb | BBO |
---|---|---|---|---|---|---|---|---|---|---|
Average | 0.802 1 | 0.824 0 | 0.772 1 | 0.917 5 | 0.902 7 | 0.810 7 | 0.695 9 | 0.824 1 | 0.764 5 | 0.638 1 |
contraceptive | 0.730 7 | 0.738 1 | 0.720 1 | 0.638 0 | 0.632 0 | 0.665 4 | 0.524 8 | 0.742 7 | 0.722 9 | 0.455 2 |
balance | 0.580 1 | 0.564 6 | 0.677 1 | 0.879 1 | NaN | 0.740 0 | NaN | 0.637 7 | 0.502 6 | NaN |
newthyroid | 0.969 2 | 0.969 2 | 0.974 9 | 0.950 2 | 0.974 6 | 0.904 7 | 0.892 3 | 0.974 9 | 0.941 4 | 0.864 1 |
splice | 0.967 2 | 0.968 0 | 0.971 6 | 0.965 7 | 0.905 3 | 0.587 3 | 0.541 3 | 0.973 3 | 0.929 2 | 0.613 7 |
thyroid | 0.971 3 | 0.971 3 | 0.972 2 | 0.999 2 | 0.998 9 | 0.956 0 | 0.651 4 | 0.969 1 | 0.940 8 | 0.598 0 |
wine | 0.978 0 | 0.978 0 | 0.538 1 | 0.981 7 | 0.972 9 | 0.809 1 | 0.712 6 | 0.978 0 | 0.831 8 | 0.659 6 |
car | 0.570 8 | 0.602 9 | 0.892 5 | 0.968 0 | NaN | 0.788 0 | 0.700 9 | 0.915 7 | 0.602 6 | NaN |
page_blocks | 0.728 7 | 0.781 6 | 0.601 7 | NaN | 0.872 9 | NaN | 0.743 5 | 0.466 7 | 0.568 8 | NaN |
flare | 0.610 1 | 0.723 7 | 0.641 0 | 0.885 4 | NaN | 0.882 7 | 0.637 1 | 0.661 3 | 0.694 7 | NaN |
satimage | 0.914 9 | 0.942 1 | 0.731 3 | 0.990 1 | 0.962 4 | 0.962 7 | 0.859 3 | 0.921 8 | 0.910 1 | NaN |
Tab. 5 F1 scores of classic methods on experimental datasets
Method groups: decomposition (A&O, OVO, OVA, DOVO, DECOC); data-level (OVO‑SMOTE, OVA‑SMOTE); algorithm-level (OAHO); cost-sensitive (FuzzyImb); ensemble (BBO).

Dataset | A&O | OVO | OVA | DOVO | DECOC | OVO‑SMOTE | OVA‑SMOTE | OAHO | FuzzyImb | BBO |
---|---|---|---|---|---|---|---|---|---|---|
Average | 0.897 4 | 0.914 9 | 0.860 5 | 0.925 0 | 0.887 1 | 0.773 5 | 0.781 6 | 0.907 1 | 0.842 4 | 0.744 7 |
contraceptive | 0.799 9 | 0.800 3 | 0.790 5 | 0.652 4 | 0.639 3 | 0.584 2 | 0.583 9 | 0.810 1 | 0.702 4 | 0.600 2 |
balance | 0.770 7 | 0.748 5 | 0.834 0 | 0.924 9 | 0.932 9 | 0.701 2 | 0.760 7 | 0.804 7 | 0.731 8 | 0.762 5 |
newthyroid | 0.989 9 | 0.989 9 | 0.991 9 | 0.970 0 | 0.986 0 | 0.952 0 | 0.901 6 | 0.991 9 | 0.979 7 | 0.867 0 |
splice | 0.978 2 | 0.979 0 | 0.977 5 | 0.972 7 | 0.929 5 | 0.587 0 | 0.605 6 | 0.980 4 | 0.903 4 | 0.742 2 |
thyroid | 0.997 3 | 0.997 3 | 0.997 4 | 0.990 9 | 0.990 1 | 0.654 5 | 0.693 2 | 0.996 9 | 0.929 9 | 0.646 7 |
wine | 0.982 6 | 0.982 6 | 0.724 8 | 0.985 2 | 0.976 4 | 0.814 0 | 0.795 9 | 0.985 2 | 0.888 6 | 0.759 9 |
car | 0.836 7 | 0.856 8 | 0.949 0 | 0.977 9 | 0.672 6 | 0.859 5 | 0.828 4 | 0.958 5 | 0.817 0 | 0.775 4 |
page_blocks | 0.970 2 | 0.969 9 | 0.877 7 | 0.945 7 | 0.929 2 | NaN | 0.868 3 | 0.776 8 | 0.817 5 | 0.810 2 |
flare | 0.693 7 | 0.854 0 | 0.615 6 | 0.839 3 | 0.856 9 | 0.851 0 | 0.817 9 | 0.815 5 | 0.709 9 | 0.738 2 |
satimage | 0.954 3 | 0.971 1 | 0.846 6 | 0.991 0 | 0.957 8 | 0.958 4 | 0.960 2 | 0.950 7 | 0.943 6 | NaN |
Tab. 6 AUC values of classic methods on experimental datasets
Method groups: decomposition (A&O, OVO, OVA, DOVO, DECOC); data-level (OVO‑SMOTE, OVA‑SMOTE); algorithm-level (OAHO); cost-sensitive (FuzzyImb); ensemble (BBO).

Dataset | A&O | OVO | OVA | DOVO | DECOC | OVO‑SMOTE | OVA‑SMOTE | OAHO | FuzzyImb | BBO |
---|---|---|---|---|---|---|---|---|---|---|
Average | 0.797 9 | 0.807 6 | 0.781 9 | 0.916 5 | 0.856 2 | 0.731 4 | 0.779 7 | 0.813 4 | 0.744 7 | 0.674 2 |
contraceptive | 0.615 0 | 0.614 1 | 0.596 1 | 0.635 5 | 0.626 1 | 0.427 1 | 0.510 3 | 0.638 7 | 0.607 2 | 0.547 1 |
balance | 0.551 3 | 0.514 7 | 0.653 3 | 0.920 3 | 0.928 0 | 0.529 4 | 0.724 2 | 0.592 1 | 0.465 2 | 0.598 6 |
newthyroid | 0.983 2 | 0.983 2 | 0.986 6 | 0.968 2 | 0.985 4 | 0.949 9 | 0.890 0 | 0.986 6 | 0.966 1 | 0.851 6 |
splice | 0.957 7 | 0.958 9 | 0.954 8 | 0.972 7 | 0.929 3 | 0.891 7 | 0.859 1 | 0.960 5 | 0.904 5 | 0.712 8 |
thyroid | 0.994 7 | 0.994 7 | 0.994 7 | 0.990 7 | 0.990 0 | 0.467 1 | 0.606 5 | 0.994 1 | 0.930 9 | 0.514 7 |
wine | 0.965 7 | 0.965 7 | 0.925 4 | 0.984 9 | 0.975 9 | 0.790 8 | 0.779 1 | 0.971 4 | 0.759 9 | 0.738 7 |
car | 0.563 5 | 0.606 8 | 0.860 2 | 0.977 1 | 0.447 6 | 0.831 3 | 0.810 5 | 0.889 5 | 0.642 0 | 0.702 0 |
page_blocks | 0.895 4 | 0.893 8 | 0.892 9 | 0.940 0 | 0.918 1 | 0.682 1 | 0.854 0 | 0.637 2 | 0.811 4 | 0.762 2 |
flare | 0.673 4 | 0.682 4 | 0.605 8 | 0.784 3 | 0.804 1 | 0.789 1 | 0.803 5 | 0.696 6 | 0.721 9 | 0.640 8 |
satimage | 0.779 0 | 0.861 9 | 0.349 5 | 0.991 0 | 0.957 1 | 0.955 7 | 0.959 9 | 0.767 2 | 0.637 9 | 0.673 9 |
Tab. 7 G-mean values of classic methods on experimental datasets
Method groups: decomposition (A&O, OVO, OVA, DOVO, DECOC); data-level (OVO‑SMOTE, OVA‑SMOTE); algorithm-level (OAHO); cost-sensitive (FuzzyImb); ensemble (BBO).

Dataset | A&O | OVO | OVA | DOVO | DECOC | OVO‑SMOTE | OVA‑SMOTE | OAHO | FuzzyImb | BBO |
---|---|---|---|---|---|---|---|---|---|---|
Average | 0.734 8 | 0.778 8 | 0.724 9 | 0.842 5 | 0.763 3 | 0.540 8 | 0.518 9 | 0.818 3 | 0.425 3 | 0.493 6 |
contraceptive | 0.604 6 | 0.616 0 | 0.588 1 | 0.307 9 | 0.282 2 | 0.145 9 | 0.125 9 | 0.623 7 | 0.563 2 | 0.211 2 |
balance | 0.457 6 | 0.437 8 | 0.630 8 | 0.769 9 | 0.771 9 | 0.356 8 | 0.510 7 | 0.573 2 | 0.347 4 | 0.521 8 |
newthyroid | 0.951 6 | 0.951 6 | 0.961 0 | 0.940 5 | 0.968 7 | 0.872 2 | 0.799 4 | 0.961 0 | 0.905 6 | 0.805 7 |
splice | 0.950 3 | 0.951 9 | 0.957 6 | 0.944 2 | 0.848 6 | 0.151 0 | 0.147 7 | 0.960 8 | 0.536 1 | 0.388 8 |
thyroid | 0.964 3 | 0.964 3 | 0.965 3 | 0.975 7 | 0.970 5 | 0.388 3 | 0.391 0 | 0.956 9 | 0.022 0 | 0.398 4 |
wine | 0.965 8 | 0.965 8 | 0.489 2 | 0.967 7 | 0.949 2 | 0.607 1 | 0.515 0 | 0.966 0 | 0.749 5 | 0.516 9 |
car | 0.407 5 | 0.477 7 | 0.825 9 | 0.957 8 | 0.349 5 | 0.666 0 | 0.621 1 | 0.842 5 | -0.063 6 | 0.614 1 |
page_blocks | 0.716 3 | 0.752 0 | 0.764 1 | 0.891 1 | 0.850 9 | 0.610 3 | 0.675 3 | 0.735 1 | 0.604 3 | 0.674 8 |
flare | 0.416 3 | 0.734 3 | 0.359 2 | 0.687 3 | 0.724 2 | 0.693 7 | 0.572 0 | 0.665 3 | 0.386 1 | 0.477 2 |
satimage | 0.913 2 | 0.936 4 | 0.707 3 | 0.982 6 | 0.917 6 | 0.916 7 | 0.830 4 | 0.898 5 | 0.202 3 | 0.327 1 |
Tab. 8 Kappa values of classic methods on experimental datasets
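The multi-class metrics reported in Tables 4-8 generalize the binary definitions: macro-averaged F1, Cohen's Kappa, and a G-mean taken as the geometric mean of the per-class recalls. A sketch on a hypothetical 3-class labeling, assuming scikit-learn and macro averaging (the paper's exact evaluation protocol may differ):

```python
import numpy as np
from sklearn.metrics import accuracy_score, cohen_kappa_score, f1_score, recall_score

# Hypothetical imbalanced 3-class ground truth (8/4/2 instances) and predictions.
y_true = np.array([0]*8 + [1]*4 + [2]*2)
y_pred = np.array([0]*7 + [1] + [1]*3 + [0] + [2, 1])

acc   = accuracy_score(y_true, y_pred)
f1    = f1_score(y_true, y_pred, average="macro")   # unweighted mean of per-class F1
kappa = cohen_kappa_score(y_true, y_pred)           # agreement corrected for chance

# Multi-class G-mean: geometric mean of the per-class recalls.
recalls = recall_score(y_true, y_pred, average=None)
g_mean  = float(np.prod(recalls) ** (1.0 / len(recalls)))

print(round(acc, 4), round(f1, 4), round(kappa, 4), round(g_mean, 4))
```

Because macro averaging and the geometric mean weight every class equally, errors on the smallest class pull these scores down far more than they pull down accuracy, which matches the gaps visible between Table 4 and Tables 5-8.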
1 | SHILASKAR S, GHATOL A. Diagnosis system for imbalanced multi‑minority medical dataset[J]. Soft Computing, 2019, 23(13): 4789-4799. 10.1007/s00500-018-3133-x |
2 | LANGO M. Tackling the problem of class imbalance in multi‑class sentiment classification: an experimental study[J]. Foundations of Computing and Decision Sciences, 2019, 44(2): 151-178. 10.2478/fcds-2019-0009 |
3 | KRAWCZYK B, McINNES B T, CANO A. Sentiment classification from multi‑class imbalanced twitter data using binarization[C]// Proceedings of the 2017 International Conference on Hybrid Artificial Intelligence Systems, LNCS 10334. Cham: Springer, 2017: 26-37. |
4 | KULKARNI R, VINTRÓ M, KAPETANAKIS S, et al. Performance comparison of popular text vectorising models on multi‑class email classification[C]// Proceedings of the 2018 SAI Intelligent Systems Conference, AISC 868. Cham: Springer, 2019: 567-578. |
5 | DORADO‑MORENO M, GUTIÉRREZ P A, CORNEJO‑BUENO L, et al. Ordinal multi‑class architecture for predicting wind power ramp events based on reservoir computing[J]. Neural Processing Letters, 2020, 52(1): 57-74. 10.1007/s11063-018-9922-5 |
6 | YUAN Y L, HUO L W, HOGREFE D. Two layers multi‑class detection method for network intrusion detection system[C]// Proceedings of the 2017 IEEE Symposium on Computers and Communications. Piscataway: IEEE, 2017: 767-772. 10.1109/iscc.2017.8024620 |
7 | BENCHAJI I, DOUZI S, OUAHIDI B EL. Using genetic algorithm to improve classification of imbalanced datasets for credit card fraud detection[C]// Proceedings of the 2019 International Conference on Advanced Information Technology, Services and Systems, LNNS 66. Cham: Springer, 2019: 220-229. |
8 | LI Y X, CHAI Y, HU Y Q, et al. Review of imbalanced data classification methods[J]. Control and Decision, 2019, 34(4): 673-688. 10.13195/j.kzyjc.2018.0865 |
9 | SAHARE M, GUPTA H. A review of multi‑class classification for imbalanced data[J]. International Journal of Advanced Computer Research, 2012, 2(5): 160-164. |
10 | TANHA J, ABDI Y, SAMADI N, et al. Boosting methods for multi‑class imbalanced data classification: an experimental review[J]. Journal of Big Data, 2020, 7: No.70. 10.1186/s40537-020-00349-y |
11 | KAUR H, PANNU H S, MALHI A K. A systematic review on imbalanced data challenges in machine learning[J]. ACM Computing Surveys, 2019, 52(4): No.79. 10.1145/3343440 |
12 | KRAWCZYK B, KOZIARSKI M, WOŹNIAK M. Radial‑based oversampling for multiclass imbalanced data classification[J]. IEEE Transactions on Neural Networks and Learning Systems, 2020, 31(8): 2818-2831. 10.1109/tnnls.2019.2913673 |
13 | ZHANG Z L, KRAWCZYK B, GARCÌA S, et al. Empowering one‑vs‑one decomposition with ensemble learning for multi‑class imbalanced data[J]. Knowledge‑Based Systems, 2016, 106: 251-263. 10.1016/j.knosys.2016.05.048 |
14 | RODRÍGUEZ J J, DÍEZ‑PASTOR J F, ARNAIZ‑GONZÁLEZ Á, et al. Random balance ensembles for multiclass imbalance learning[J]. Knowledge‑Based Systems, 2020, 193: No.105434. 10.1016/j.knosys.2019.105434 |
15 | ŻAK M, WOŹNIAK M. Performance analysis of binarization strategies for multi‑class imbalanced data classification[C]// Proceedings of the 2020 International Conference on Computational Science, LNCS 12140. Cham: Springer, 2020: 141-155. |
16 | ZHANG Z L, LUO X G, GONZÁLEZ S, et al. DRCW‑ASEG: One‑versus‑one distance‑based relative competence weighting with adaptive synthetic example generation for multi‑class imbalanced datasets[J]. Neurocomputing, 2018, 285: 176-187. 10.1016/j.neucom.2018.01.039 |
17 | LIANG L J, JIN T T, HUO M Y. Feature identification from imbalanced data sets for diagnosis of cardiac arrhythmia[C]// Proceedings of the 11th International Symposium on Computational Intelligence and Design. Piscataway: IEEE, 2018: 52-55. 10.1109/iscid.2018.10113 |
18 | CHAWLA N V, BOWYER K W, HALL L O, et al. SMOTE: synthetic minority over‑sampling technique[J]. Journal of Artificial Intelligence Research, 2002, 16: 321-357. 10.1613/jair.953 |
19 | LIU X Y, WU J X, ZHOU Z H. Exploratory undersampling for class‑imbalance learning[J]. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 2009, 39(2): 539-550. 10.1109/tsmcb.2008.2007853 |
20 | BARANDELA R, VALDOVINOS R M, SÁNCHEZ J S. New applications of ensembles of classifiers[J]. Pattern Analysis and Applications, 2003, 6(3): 245-256. 10.1007/s10044-003-0192-z |
21 | WANG S, YAO X. Diversity analysis on imbalanced data sets by using ensemble models[C]// Proceedings of the 2009 IEEE Symposium on Computational Intelligence and Data Mining. Piscataway: IEEE, 2009: 324-331. 10.1109/cidm.2009.4938667 |
22 | SEIFFERT C, KHOSHGOFTAAR T M, van HULSE J, et al. RUSBoost: a hybrid approach to alleviating class imbalance[J]. IEEE Transactions on Systems, Man, and Cybernetics — Part A: Systems and Humans, 2010, 40(1): 185-197. 10.1109/tsmca.2009.2029559 |
23 | CHAWLA N V, LAZAREVIC A, HALL L O, et al. SMOTEBoost: improving prediction of the minority class in boosting[C]// Proceedings of the 2003 European Conference on Principles of Data Mining and Knowledge Discovery, LNCS 2838. Berlin: Springer, 2003: 107-119. |
24 | JEGIERSKI H, SAGANOWSKI S. An “outside the box” solution for imbalanced data classification[J]. IEEE Access, 2020, 8: 125191-125209. 10.1109/access.2020.3007801 |
25 | SEN A, ISLAM M M, MURASE K, et al. Binarization with boosting and oversampling for multiclass classification[J]. IEEE Transactions on Cybernetics, 2016, 46(5): 1078-1091. 10.1109/tcyb.2015.2423295 |
26 | JIANG C Q, LIU Y, DING Y, et al. Capturing helpful reviews from social media for product quality improvement: a multi‑class classification approach[J]. International Journal of Production Research, 2017, 55(12): 3528-3541. 10.1080/00207543.2017.1304664 |
27 | SÁEZ J A, GALAR M, LUENGO J, et al. Analyzing the presence of noise in multi‑class problems: alleviating its influence with the One‑vs‑One decomposition[J]. Knowledge and Information Systems, 2014, 38(1): 179-206. 10.1007/s10115-012-0570-1 |
28 | MURPHEY Y L, WANG H X, OU G B, et al. OAHO: an effective algorithm for multi‑class learning from imbalanced data[C]// Proceedings of the 2007 International Joint Conference on Neural Networks. Piscataway: IEEE, 2007: 406-411. 10.1109/ijcnn.2007.4370991 |
29 | HAN H, WANG W Y, MAO B H. Borderline‑SMOTE: a new over‑sampling method in imbalanced data sets learning[C]// Proceedings of the 2005 International Conference on Intelligent Computing, LNCS 3644. Berlin: Springer, 2005: 878-887. |
30 | HE H B, BAI Y, GARCIA E A, et al. ADASYN: adaptive synthetic sampling approach for imbalanced learning[C]// Proceedings of the 2008 IEEE International Joint Conference on Neural Network (IEEE World Congress on Computational Intelligence). Piscataway: IEEE, 2008: 1322-1328. 10.1109/ijcnn.2008.4633969 |
31 | GALAR M, FERNÁNDEZ A, BARRENECHEA E, et al. DRCW‑OVO: distance‑based relative competence weighting combination for One‑vs‑One strategy in multi‑class problems[J]. Pattern Recognition, 2015, 48(1): 28-42. 10.1016/j.patcog.2014.07.023 |
32 | ZHANG J H, CUI X Q, LI J R, et al. Imbalanced classification of mental workload using a cost‑sensitive majority weighted minority oversampling strategy[J]. Cognition, Technology and Work, 2017, 19(4): 633-653. 10.1007/s10111-017-0447-x |
33 | PATIL S S, SONAVANE S P. Enriched over_sampling techniques for improving classification of imbalanced big data[C]// Proceedings of the IEEE 3rd International Conference on Big Data Computing Service and Applications. Piscataway: IEEE, 2017: 1-10. 10.1109/bigdataservice.2017.19 |
34 | RIVERA W, ASPAROUHOV O. Safe level OUPS for improving target concept learning in imbalanced data sets[C]// Proceedings of the 2015 IEEE SoutheastCon. Piscataway: IEEE, 2015: 1-8. 10.1109/secon.2015.7132940 |
35 | MATHEW J, PANG C K, LUO M, et al. Classification of imbalanced data by oversampling in kernel space of support vector machines[J]. IEEE Transactions on Neural Networks and Learning Systems, 2018, 29(9): 4065-4076. 10.1109/tnnls.2017.2751612 |
36 | ZAREAPOOR M, SHAMSOLMOALI P, YANG J. Oversampling adversarial network for class‑imbalanced fault diagnosis[J]. Mechanical Systems and Signal Processing, 2021, 149: No.107175. 10.1016/j.ymssp.2020.107175 |
37 | XIA M, LI T, XU L, et al. Fault diagnosis for rotating machinery using multiple sensors and convolutional neural networks[J]. IEEE‑ASME Transactions on Mechatronics, 2018, 23(1): 101-110. 10.1109/tmech.2017.2728371 |
38 | LIU H, ZHOU J Z, XU Y H, et al. Unsupervised fault diagnosis of rolling bearings using a deep neural network based on generative adversarial networks[J]. Neurocomputing, 2018, 315: 412-424. 10.1016/j.neucom.2018.07.034 |
39 | YU H Y, CHEN C Y, YANG H M. Two‑stage game strategy for multiclass imbalanced data online prediction[J]. Neural Processing Letters, 2020, 52(3): 2493-2512. 10.1007/s11063-020-10358-w |
40 | LEE J, PARK K. GAN‑based imbalanced data intrusion detection system[J]. Personal and Ubiquitous Computing, 2021, 25(1): 121-128. 10.1007/s00779-019-01332-y |
41 | SHAMSOLMOALI P, ZAREAPOOR M, SHEN L L, et al. Imbalanced data learning by minority class augmentation using capsule adversarial networks[J]. Neurocomputing, 2020, 459: 481-493. 10.1016/j.neucom.2020.01.119 |
42 | POUYANFAR S, CHEN S C, SHYU M L. Deep spatio‑temporal representation learning for multi‑class imbalanced data classification[C]// Proceedings of the 2018 IEEE International Conference on Information Reuse and Integration. Piscataway: IEEE, 2018: 386-393. 10.1109/iri.2018.00064 |
43 | LIU Q J, MA G J, CHENG C. Data fusion generative adversarial network for multi‑class imbalanced fault diagnosis of rotating machinery[J]. IEEE Access, 2020, 8: 70111-70124. 10.1109/access.2020.2986356 |
44 | YANG X B, KUANG Q M, ZHANG W S, et al. AMDO: an over‑sampling technique for multi‑class imbalanced problems[J]. IEEE Transactions on Knowledge and Data Engineering, 2018, 30(9): 1672-1685. 10.1109/tkde.2017.2761347 |
45 | ABDI L, HASHEMI S. To combat multi‑class imbalanced problems by means of over‑sampling techniques[J]. IEEE Transactions on Knowledge and Data Engineering, 2016, 28(1): 238-251. 10.1109/tkde.2015.2458858 |
46 | LI Q M, SONG Y J, ZHANG J, et al. Multiclass imbalanced learning with one‑versus‑one decomposition and spectral clustering[J]. Expert Systems with Applications, 2020, 147: No.113152. 10.1016/j.eswa.2019.113152 |
47 | CHEN X T, ZHANG L, WEI X H, et al. An effective method using clustering‑based adaptive decomposition and editing‑based diversified oversamping for multi‑class imbalanced datasets[J]. Applied Intelligence, 2021, 51(4): 1918-1933. 10.1007/s10489-020-01883-1 |
48 | SANTOSO B, WIJAYANTO H, NOTODIPUTRO K A, et al. K‑Neighbor over‑sampling with cleaning data: a new approach to improve classification performance in data sets with class imbalance[J]. Applied Mathematical Sciences, 2018, 12(10): 449-460. 10.12988/ams.2018.8231 |
49 | KOZIARSKI M, WOŹNIAK M, KRAWCZYK B. Combined cleaning and resampling algorithm for multi‑class imbalanced data with label noise[J]. Knowledge‑Based Systems, 2020, 204: No.106223. 10.1016/j.knosys.2020.106223 |
50 | WU Q, LIN Y P, ZHU T F, et al. HUSBoost: a hubness‑aware boosting for high‑dimensional imbalanced data classification[C]// Proceedings of the 2019 International Conference on Machine Learning and Data Engineering. Piscataway: IEEE, 2019: 36-41. 10.1109/icmlde49015.2019.00018 |
51 | RAYHAN F, AHMED S, MAHBUB A, et al. CUSBoost: cluster‑based under‑sampling with boosting for imbalanced classification[C]// Proceedings of the 2nd International Conference on Computational Systems and Information Technology for Sustainable Solution. Piscataway: IEEE, 2017: 1-5. 10.1109/csitss.2017.8447534 |
52 | LI Y, WANG J, WANG S G, et al. Local dense mixed region cutting + global rebalancing: a method for imbalanced text sentiment classification[J]. International Journal of Machine Learning and Cybernetics, 2019, 10(7): 1805-1820. 10.1007/s13042-018-0858-x |
53 | LI L S, HE H B, LI J. Entropy‑based sampling approaches for multi‑class imbalanced problems[J]. IEEE Transactions on Knowledge and Data Engineering, 2020, 32(11): 2159-2170. 10.1109/tkde.2019.2913859 |
54 | GALAR M, FERNÁNDEZ A, BARRENECHEA E, et al. EUSBoost: enhancing ensembles for highly imbalanced data‑sets by evolutionary undersampling[J]. Pattern Recognition, 2013, 46(12): 3460-3471. 10.1016/j.patcog.2013.05.006 |
55 | GARCÍA S, HERRERA F. Evolutionary undersampling for classification with imbalanced datasets: proposals and taxonomy[J]. Evolutionary Computation, 2009, 17(3): 275-306. 10.1162/evco.2009.17.3.275 |
56 | FERNANDES E R Q, DE CARVALHO A C P L F. Evolutionary inversion of class distribution in overlapping areas for multi‑class imbalanced learning[J]. Information Sciences, 2019, 494: 141-154. 10.1016/j.ins.2019.04.052 |
57 | DEB K, PRATAP A, AGARWAL S, et al. A fast and elitist multiobjective genetic algorithm: NSGA‑Ⅱ[J]. IEEE Transactions on Evolutionary Computation, 2002, 6(2): 182-197. 10.1109/4235.996017 |
58 | GOLDBERG D E. Genetic Algorithms in Search, Optimization, and Machine Learning[M]. Boston: Addison‑Wesley Professional, 1989: 95-99. 10.5860/choice.27-0936 |
59 | LIU Z, TANG D Y, CAI Y M, et al. A hybrid method based on ensemble WELM for handling multi class imbalance in cancer microarray data[J]. Neurocomputing, 2017, 266: 641-650. 10.1016/j.neucom.2017.05.066 |
60 | SARIKAYA A, KILIÇ B G. A class‑specific intrusion detection model: hierarchical multi‑class IDS model[J]. SN Computer Science, 2020, 1(4): No.202. 10.1007/s42979-020-00213-z |
61 | LI J T, WANG Y Y, SONG X K, et al. Adaptive multinomial regression with overlapping groups for multi‑class classification of lung cancer[J]. Computers in Biology and Medicine, 2018, 100: 1-9. 10.1016/j.compbiomed.2018.06.014 |
62 | DUFRENOIS F. A one‑class kernel fisher criterion for outlier detection[J]. IEEE Transactions on Neural Networks and Learning Systems, 2015, 26(5): 982-994. 10.1109/tnnls.2014.2329534 |
63 | BELLINGER C, SHARMA S, JAPKOWICZ N. One‑class versus binary classification: which and when?[C]// Proceedings of the 11th International Conference on Machine Learning and Applications. Piscataway: IEEE, 2012: 102-106. 10.1109/icmla.2012.212 |
64 | HEMPSTALK K, FRANK E. Discriminating against new classes: one‑class versus multi‑class classification[C]// Proceedings of the 2008 Australasian Joint Conference on Artificial Intelligence, LNCS 5360. Berlin: Springer, 2008: 325-336. |
65 | KRAWCZYK B, WOŹNIAK M, HERRERA F. On the usefulness of one‑class classifier ensembles for decomposition of multi‑class problems[J]. Pattern Recognition, 2015, 48(12): 3969-3982. 10.1016/j.patcog.2015.06.001 |
66 | PÉREZ‑SÁNCHEZ B, FONTENLA‑ROMERO O, SÁNCHEZ‑MAROÑO N. Selecting target concept in one‑class classification for handling class imbalance problem[C]// Proceedings of the 2015 International Joint Conference on Neural Networks. Piscataway: IEEE, 2015: 1-8. 10.1109/ijcnn.2015.7280661 |
67 | KRAWCZYK B, GALAR M, WOŹNIAK M, et al. Dynamic ensemble selection for multi‑class classification with one‑class classifiers[J]. Pattern Recognition, 2018, 83: 34-51. 10.1016/j.patcog.2018.05.015 |
68 | GAO L, ZHANG L, LIU C, et al. Handling imbalanced medical image data: a deep‑learning‑based one‑class classification approach[J]. Artificial Intelligence in Medicine, 2020, 108: No.101935. 10.1016/j.artmed.2020.101935 |
69 | WAN J W, YANG M. Survey on cost‑sensitive learning method[J]. Journal of Software, 2020, 31(1): 113-136. 10.13328/j.cnki.jos.005871 |
70 | ZHANG Z L, LUO X G, GARCÍA S, et al. Cost‑sensitive back‑propagation neural networks with binarization techniques in addressing multi‑class problems and non‑competent classifiers[J]. Applied Soft Computing, 2017, 56: 357-367. 10.1016/j.asoc.2017.03.016 |
71 | LING C X, SHENG V S. Cost‑sensitive learning and the class imbalance problem[M]// Encyclopedia of Machine Learning. Boston: Springer, 2010: 171, 231-235. 10.4018/978-1-60566-010-3.ch054 |
72 | DOMINGOS P. MetaCost: a general method for making classifiers cost‑sensitive[C]// Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 1999: 155-164. 10.1145/312129.312220 |
73 | IRANMEHR A, MASNADI‑SHIRAZI H, VASCONCELOS N. Cost‑sensitive support vector machines[J]. Neurocomputing, 2019, 343: 50-64. 10.1016/j.neucom.2018.11.099 |
74 | GU B, SHENG V S, TAY K Y, et al. Cross validation through two‑dimensional solution surface for cost‑sensitive SVM[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1103-1121. 10.1109/tpami.2016.2578326 |
75 | ZHANG C, TAN K C, LI H Z, et al. A cost‑sensitive deep belief network for imbalanced classification[J]. IEEE Transactions on Neural Networks and Learning Systems, 2019, 30(1): 109-122. 10.1109/tnnls.2018.2832648 |
76 | LANGO M, STEFANOWSKI J. Multi‑class and feature selection extensions of roughly balanced bagging for imbalanced data[J]. Journal of Intelligent Information Systems, 2018, 50(1): 97-127. 10.1007/s10844-017-0446-7 |
77 | HIDO S, KASHIMA H, TAKAHASHI Y. Roughly balanced bagging for imbalanced data[J]. Statistical Analysis and Data Mining, 2009, 2(5/6): 412-426. 10.1002/sam.10061 |
78 | TAHERKHANI A, COSMA G, McGINNITY T M. AdaBoost‑CNN: an adaptive boosting algorithm for convolutional neural networks to classify multi‑class imbalanced datasets using transfer learning[J]. Neurocomputing, 2020, 404: 351-366. 10.1016/j.neucom.2020.03.064 |
79 | DÍEZ‑PASTOR J F, RODRÍGUEZ J J, GARCÍA‑OSORIO C, et al. Random Balance: ensembles of variable priors classifiers for imbalanced data[J]. Knowledge‑Based Systems, 2015, 85: 96-111. 10.1016/j.knosys.2015.04.022 |
80 | FERNÁNDEZ‑BALDERA A, BUENAPOSADA J M, BAUMELA L. BAdaCost: multi‑class Boosting with costs[J]. Pattern Recognition, 2018, 79: 467-479. 10.1016/j.patcog.2018.02.022 |
81 | SCHWENKER F. Ensemble methods: foundations and algorithms [Book Review][J]. IEEE Computational Intelligence Magazine, 2013, 8(1): 77-79. 10.1109/mci.2012.2228600 |
82 | JOHNSON J M, KHOSHGOFTAAR T M. Survey on deep learning with class imbalance[J]. Journal of Big Data, 2019, 6: No.27. 10.1186/s40537-019-0192-5 |
83 | RENDÓN E, ALEJO R, CASTORENA C, et al. Data sampling methods to deal with the big data multi‑class imbalance problem[J]. Applied Sciences, 2020, 10(4): No.1276. 10.3390/app10041276 |
84 | WILSON D L. Asymptotic properties of nearest neighbor rules using edited data[J]. IEEE Transactions on Systems, Man and Cybernetics, 1972, SMC‑2(3): 408-421. 10.1109/tsmc.1972.4309137 |
85 | TOMEK I. Two modifications of CNN[J]. IEEE Transactions on Systems, Man and Cybernetics, 1976, SMC‑6(11): 769-772. 10.1109/tsmc.1976.4309452 |
86 | RAGHUWANSHI B S, SHUKLA S. Generalized class‑specific kernelized extreme learning machine for multiclass imbalanced learning[J]. Expert Systems with Applications, 2019, 121: 244-255. 10.1016/j.eswa.2018.12.024 |
87 | RAGHUWANSHI B S, SHUKLA S. Class‑specific kernelized extreme learning machine for binary class imbalance learning[J]. Applied Soft Computing, 2018, 73: 1026-1038. 10.1016/j.asoc.2018.10.011 |
88 | MOSLEY L S D. A balanced approach to the multi‑class imbalance problem[D]. Ames, IA: Iowa State University, 2013: 15-25. |
89 | SOKOLOVA M, LAPALME G. A systematic analysis of performance measures for classification tasks[J]. Information Processing and Management, 2009, 45(4): 427-437. 10.1016/j.ipm.2009.03.002 |
90 | MORTAZ E. Imbalance accuracy metric for model selection in multi‑class imbalance classification problems[J]. Knowledge‑Based Systems, 2020, 210: No.106490. 10.1016/j.knosys.2020.106490 |
91 | VIERA A J, GARRETT J M. Understanding interobserver agreement: the kappa statistic[J]. Family Medicine, 2005, 37(5): 360-363. |
92 | WEI J M, YUAN X J, HU Q H, et al. A novel measure for evaluating classifiers[J]. Expert Systems with Applications, 2010, 37(5): 3799-3809. 10.1016/j.eswa.2009.11.040 |
93 | BRANCO P, TORGO L, RIBEIRO R P. Relevance‑based evaluation metrics for multi‑class imbalanced domains[C]// Proceedings of the 2017 Pacific‑Asia Conference on Knowledge Discovery and Data Mining, LNCS 10234. Cham: Springer, 2017: 698-710. |
94 | GORODKIN J. Comparing two K‑category assignments by a K‑category correlation coefficient[J]. Computational Biology and Chemistry, 2004, 28(5/6): 367-374. 10.1016/j.compbiolchem.2004.09.006 |
95 | MATTHEWS B W. Comparison of the predicted and observed secondary structure of T4 phage lysozyme[J]. Biochimica et Biophysica Acta (BBA) — Protein Structure, 1975, 405(2): 442-451. 10.1016/0005-2795(75)90109-9 |
96 | GARCÍA‑PEDRAJAS N, ORTIZ‑BOYER D. Improving multiclass pattern recognition by the combination of two strategies[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006, 28(6): 1001-1006. 10.1109/tpami.2006.123 |
97 | FERNÁNDEZ A, LÓPEZ V, GALAR M, et al. Analysing the classification of imbalanced data‑sets with multiple classes: binarization techniques and ad‑hoc approaches[J]. Knowledge‑Based Systems, 2013, 42: 97-110. 10.1016/j.knosys.2013.01.018 |
98 | RAMENTOL E, VLUYMANS S, VERBIEST N, et al. IFROWANN: imbalanced fuzzy‑rough ordered weighted average nearest neighbor classification[J]. IEEE Transactions on Fuzzy Systems, 2015, 23(5): 1622-1637. 10.1109/tfuzz.2014.2371472 |
99 | BI J J, ZHANG C S. An empirical comparison on state‑of‑the‑art multi‑class imbalance learning algorithms and a new diversified ensemble learning scheme[J]. Knowledge‑Based Systems, 2018, 158: 81-93. 10.1016/j.knosys.2018.05.037 |
100 | KANG S, CHO S, KANG P. Constructing a multi‑class classifier using one‑against‑one approach with different binary classifiers[J]. Neurocomputing, 2015, 149(Pt B): 677-682. 10.1016/j.neucom.2014.08.006 |