Journal of Computer Applications, 2022, Vol. 42, Issue (9): 2701-2712. DOI: 10.11772/j.issn.1001-9081.2021081371
• Data science and technology •
Yan LI, Bin FAN, Jie GUO
Received: 2021-08-02
Revised: 2021-11-09
Accepted: 2021-11-20
Online: 2022-01-07
Published: 2022-09-10
Contact: Bin FAN
About author: LI Yan, born in 1976 in Hengshui, Hebei, Ph. D., professor, CCF member. Her research interests include machine learning and uncertain information processing.
Yan LI, Bin FAN, Jie GUO. Attribute reduction algorithm based on cluster granulation and divergence among clusters[J]. Journal of Computer Applications, 2022, 42(9): 2701-2712.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2021081371
Dataset | Samples | Conditional attributes | Classes |
---|---|---|---|
iris | 150 | 4 | 3 |
wine | 178 | 13 | 3 |
glass | 214 | 9 | 6 |
forest | 517 | 13 | 2 |
breast | 699 | 9 | 2 |
QSAR | 1 055 | 41 | 2 |
banknote | 1 372 | 4 | 2 |
wireless | 2 000 | 7 | 4 |
wine quality | 4 898 | 11 | 7 |
electrical | 10 000 | 13 | 2 |
iono | 351 | 34 | 2 |
sonar | 208 | 60 | 2 |
libras | 360 | 90 | 15 |
DLBCL | 77 | 7 129 | 2 |
Tab. 1 Experimental datasets
Dataset | Reduction result | Clustering parameter k |
---|---|---|
iris | {1,3,4} | 7 |
wine | {1,7,10,12} | 12 |
glass | {3,4,6,7} | 10 |
forest | {11} | 8 |
breast | {1,2,6,8} | 11 |
QSAR | {1,14,15,27,36,39} | 17 |
banknote | {1,2,3} | 9 |
wireless | {1,4,5,6} | 6 |
wine quality | {8,9,10,11} | 7 |
electrical | {13} | 3 |
iono | {5,11,12,13,27} | 7 |
sonar | {11,15,16,17,18,19,20,21,26,35,36,37,45} | 11 |
libras | {1,3,5,7,9,11,12,13,15,17,19,21,23,25,27,30,49,53,55,71} | 11 |
DLBCL | {265,276,545,942,1 320,1 701,1 930,1 951,2 094,2 280,2 384,2 548,2 574,2 648,2 668,2 736,3 531,3 603,3 656,3 845,3 968,4 183,4 315,4 371,4 838,5 067,5 137,5 729,6 451,6 625,6 915,6 974} | 7 |
Tab. 2 ARCD attribute reduction results
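A reduction result in Tab. 2 is a set of attribute indices, and the indices run from 1 (the iris row of Tab. 3 lists {1,2,3,4} for its four attributes). A minimal sketch of applying such a result to a dataset follows. The function name `project_onto_reduct` and the two sample rows are illustrative assumptions, not taken from the paper.

```python
from typing import Iterable, List, Sequence


def project_onto_reduct(rows: Sequence[Sequence[float]],
                        reduct: Iterable[int]) -> List[List[float]]:
    """Keep only the columns named by a reduct of 1-based attribute indices."""
    cols = sorted(i - 1 for i in reduct)  # convert to 0-based positions
    if cols and cols[0] < 0:
        raise ValueError("attribute indices must start at 1")
    return [[row[c] for c in cols] for row in rows]


# Two illustrative four-attribute samples (values are placeholders).
samples = [[5.1, 3.5, 1.4, 0.2],
           [6.2, 2.9, 4.3, 1.3]]

iris_reduct = {1, 3, 4}  # ARCD result for iris in Tab. 2
reduced = project_onto_reduct(samples, iris_reduct)
print(reduced)
```

The reduced rows keep attributes 1, 3 and 4 in their original order, which is the form a downstream classifier would receive.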
Dataset | DRS | NRS | NRMIE | HANDI |
---|---|---|---|---|
iris | {1,2,3,4} | {1,3,4}(0.01) | {1,3}(0.25) | {3,4}(0.05) |
wine | {1,2,3,4,5,6,7,8,9} | {1,7,10}(0.01) | {6,7,9,12}(0.3) | {7,10}(0.01) |
glass | {2,3,4,5,6,7,8,9} | {2,3,4,5,6,7,8,9}(0.2) | {1,2,4,5,7,8}(0.6) | {2,3,4,7,9}(0.25) |
forest | {1,2,4,5,6,7,8,9} | {1,4,5,6,8,10,11,12,13}(0.15) | {4,9,10,12,13}(0.5) | {1,2,4,5,8,12,13}(0.2) |
breast | {1,2,3,4,5,6,7,8,9} | {1,2,6}(0.3) | {2,3,4,5,8,9}(0.45) | {1,2,6}(0.25) |
QSAR | {1,2,3,4,8,9} | {1,3,7,8,10,13,14,20,27,31,35,37,38}(0.1) | {5,6,11,18,21,23,26,34,38,41}(0.1) | {1,7,8,10,12,14,22,25,27,30,31,33,35,37,38}(0.15) |
banknote | {1,2,3,4} | {1,2,3}(0.01) | {2,3}(0.3) | {1,2,3}(0.01) |
wireless | {1,2,3,4,5,6,7} | {1,5,6}(0.01) | {1,3,4}(0.5) | {1,4,5,7}(0.05) |
wine quality | {1,2,3,4,5,6,7,8,9} | {3,6,8,11}(0.45) | {1,2,4,6,8}(0.6) | {1,2,3,4,5,6,7,9,10,11}(0.1) |
electrical | {1,5,6,7,8,9} | {4,11,13}(0.01) | {3,5,6,8,9,11}(0.6) | {13}(0.01) |
iono | {4,6,8,9,10,11,12,14,16,17,18,19,20,22,23,24,27,29,30,31,32,33,34} | {1,3,5,7,8,11,13,15,19,27}(0.5) | {4,6,8,9,16,20,27,28,31,33,34}(0.05) | {1,3,4,5,6,7,8,9,13,14,17,24,28}(0.65) |
sonar | {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49} | {9,12,21,22,24,28,30,33,35,36,43,45,54}(0.35) | {7,16,17,18,19,21,23,24,26,33,38,39,40,41,42,46,47,48,50,51,55,59,60}(0.5) | {11,12,14,20,21,22,23,26,28,32,34,36,47}(0.4) |
libras | {1,4,6,9,10,12,14,16,17,18,19,20,21,22,23,24,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,87,88,89,90} | {5,12,26,43,49,54,65,74,89}(0.15) | {14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,71,73,75,77,79,81,83,85,87}(0.55) | {1,2,3,6,8,12,19,21,23,24,29,33,35,36,37,38,41,42,44,47,48,49,54,57,58,59,60,62,69,70,71,72,74,75,77,82,84,86,87,89,90}(0.6) |
DLBCL | {155,211,569,1 009,1 174,1 373,1 958,3 331,4 234,4 489,4 594,5 152,6 088,6 096} | {555,1 243,2 128,2 173,2 411,3 326,4 116,5 292,5 998,6 179}(0.6) | {148,931,1 198,1 455,2 944,6 214,6 367,6 728}(0.3) | {923,1 839,2 006,4 194,5 119,5 198,5 292,6 179,6 837}(0.7) |
Tab. 3 Comparison of attribute reduction results of different algorithms
Dataset | DRS attributes | DRS reduction rate/% | NRS attributes | NRS reduction rate/% | NRMIE attributes | NRMIE reduction rate/% | HANDI attributes | HANDI reduction rate/% | ARCD attributes | ARCD reduction rate/% |
---|---|---|---|---|---|---|---|---|---|---|
Average | 16.79 | 28.64 | 6.71 | 60.02 | 10.14 | 57.94 | 9.14 | 58.50 | 7.43 | 67.70 |
iris | 4 | 0.00 | 3 | 25.00 | 2 | 50.00 | 2 | 50.00 | 3 | 25.00 |
wine | 9 | 30.77 | 3 | 76.92 | 4 | 69.23 | 2 | 84.62 | 4 | 69.23 |
glass | 8 | 11.11 | 8 | 11.11 | 6 | 33.33 | 5 | 44.44 | 4 | 55.56 |
forest | 8 | 38.46 | 9 | 30.77 | 5 | 61.54 | 7 | 46.15 | 1 | 92.31 |
breast | 9 | 0.00 | 3 | 66.67 | 6 | 33.33 | 3 | 66.67 | 4 | 55.56 |
QSAR | 6 | 85.37 | 13 | 68.29 | 10 | 75.61 | 15 | 63.41 | 6 | 85.37 |
banknote | 4 | 0.00 | 3 | 25.00 | 2 | 50.00 | 3 | 25.00 | 3 | 25.00 |
wireless | 7 | 0.00 | 3 | 57.14 | 3 | 57.14 | 4 | 42.86 | 4 | 42.86 |
wine quality | 9 | 18.18 | 4 | 63.64 | 5 | 54.55 | 10 | 9.09 | 4 | 63.64 |
electrical | 6 | 53.85 | 3 | 76.92 | 6 | 53.85 | 1 | 92.31 | 1 | 92.31 |
iono | 23 | 32.35 | 10 | 70.59 | 11 | 67.65 | 13 | 61.76 | 5 | 85.29 |
sonar | 48 | 20.00 | 13 | 78.33 | 23 | 61.67 | 13 | 78.33 | 13 | 78.33 |
libras | 80 | 11.11 | 9 | 90.00 | 51 | 43.33 | 41 | 54.44 | 20 | 77.78 |
DLBCL | 14 | 99.80 | 10 | 99.86 | 8 | 99.89 | 9 | 99.87 | 32 | 99.55 |
Tab. 4 Comparison of attribute number and reduction rate of reduction results by different algorithms
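The reduction rates in Tab. 4 are consistent with reading the rate as the share of conditional attributes removed, (1 − |R|/|C|) × 100, where |R| is the size of the reduction result and |C| the number of conditional attributes from Tab. 1. A minimal sketch under that assumption, checked against three ARCD rows (the function name `reduction_rate` is illustrative, not from the paper):

```python
def reduction_rate(n_selected: int, n_total: int) -> float:
    """Percentage of conditional attributes removed, rounded to two decimals."""
    if n_total <= 0 or not 0 <= n_selected <= n_total:
        raise ValueError("need 0 <= n_selected <= n_total and n_total > 0")
    return round((1 - n_selected / n_total) * 100, 2)


# (attributes kept by ARCD, conditional attributes in the dataset)
arcd_rows = {"iris": (3, 4), "forest": (1, 13), "DLBCL": (32, 7129)}

rates = {name: reduction_rate(kept, total)
         for name, (kept, total) in arcd_rows.items()}
print(rates)
```

The three values agree with the ARCD column of Tab. 4 (25.00, 92.31 and 99.55).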
Dataset | Original attributes | DRS(3) | NRS(3) | NRMIE(1) | HANDI(3) | ARCD(9) |
---|---|---|---|---|---|---|
Average | 90.57±3.57 | 89.97±4.14 | 92.93±3.50 | 87.84±3.41 | 93.48±3.19 | 94.78±3.32 |
iris | 97.11±2.48 | 96.32±2.41 | 97.37±2.04 | 96.84±2.29 | 96.84±1.58 | 97.37±2.04 |
wine | 91.56±6.94 | 73.11±6.08 | 94.89±2.64 | 89.56±2.64 | 94.00±2.64 | 97.56±3.06 |
glass | 83.33±4.07 | 90.37±8.02 | 92.41±7.24 | 74.26±3.36 | 91.48±7.37 | 92.22±6.62 |
forest | 88.20±3.26 | 89.84±2.52 | 86.23±2.22 | 90.33±2.69 | 89.02±2.75 | 93.93±1.80 |
breast | 97.07±0.78 | 96.49±1.28 | 96.67±1.23 | 97.08±1.11 | 97.25±1.20 | 96.61±1.43 |
QSAR | 85.53±1.75 | 92.42±5.53 | 91.97±5.82 | 82.46±1.86 | 84.17±2.69 | 85.64±2.27 |
banknote | 100.00±0.00 | 100.00±0.00 | 100.00±0.00 | 91.92±1.61 | 100.00±0.00 | 100.00±0.00 |
wireless | 98.40±0.50 | 98.70±0.49 | 97.64±0.31 | 92.90±1.13 | 98.30±0.37 | 98.20±0.48 |
wine quality | 85.49±9.88 | 85.13±10.05 | 86.04±9.11 | 84.96±10.40 | 85.94±9.44 | 87.08±8.97 |
electrical | 82.23±0.96 | 71.19±1.82 | 95.71±2.89 | 72.72±1.40 | 99.98±0.02 | 99.98±0.02 |
iono | 95.57±2.46 | 93.64±3.26 | 92.84±2.33 | 95.57±3.03 | 93.18±3.63 | 96.25±4.00 |
sonar | 86.35±6.46 | 93.73±3.75 | 91.77±3.31 | 91.54±6.03 | 92.96±3.49 | 95.58±3.45 |
libras | 93.67±4.56 | 93.11±4.49 | 94.44±4.33 | 95.67±3.96 | 93.11±4.30 | 91.44±6.87 |
DLBCL | 83.50±5.94 | 85.50±8.20 | 83.00±5.57 | 74.00±6.24 | 92.50±5.12 | 95.00±5.48 |
Tab. 5 Comparison of classification accuracy of KNN based attribute reduction
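The number in parentheses after each algorithm name appears to count the datasets on which that algorithm reaches the highest mean accuracy among the five reduction algorithms, with ties counted for every tied algorithm. This reading is an inference from the table and is not stated in the source. A short check on the KNN means of Tab. 5:

```python
ALGORITHMS = ["DRS", "NRS", "NRMIE", "HANDI", "ARCD"]

# Mean KNN accuracy per dataset, in the column order of ALGORITHMS (Tab. 5).
knn_means = {
    "iris":         [96.32, 97.37, 96.84, 96.84, 97.37],
    "wine":         [73.11, 94.89, 89.56, 94.00, 97.56],
    "glass":        [90.37, 92.41, 74.26, 91.48, 92.22],
    "forest":       [89.84, 86.23, 90.33, 89.02, 93.93],
    "breast":       [96.49, 96.67, 97.08, 97.25, 96.61],
    "QSAR":         [92.42, 91.97, 82.46, 84.17, 85.64],
    "banknote":     [100.00, 100.00, 91.92, 100.00, 100.00],
    "wireless":     [98.70, 97.64, 92.90, 98.30, 98.20],
    "wine quality": [85.13, 86.04, 84.96, 85.94, 87.08],
    "electrical":   [71.19, 95.71, 72.72, 99.98, 99.98],
    "iono":         [93.64, 92.84, 95.57, 93.18, 96.25],
    "sonar":        [93.73, 91.77, 91.54, 92.96, 95.58],
    "libras":       [93.11, 94.44, 95.67, 93.11, 91.44],
    "DLBCL":        [85.50, 83.00, 74.00, 92.50, 95.00],
}

# Count, for each algorithm, the datasets where it attains the row maximum.
wins = {name: 0 for name in ALGORITHMS}
for accuracies in knn_means.values():
    best = max(accuracies)
    for name, value in zip(ALGORITHMS, accuracies):
        if value == best:
            wins[name] += 1

print(wins)
```

Under this reading the counts match the header of Tab. 5: 3, 3, 1, 3 and 9.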
Dataset | Original attributes | DRS (2) | NRS (4) | NRMIE (1) | HANDI (4) | ARCD (7) |
---|---|---|---|---|---|---|
Average | 94.02±3.78 | 90.47±4.04 | 93.61±3.70 | 89.12±4.69 | 93.41±3.66 | 95.11±3.57 |
iris | 96.32±2.11 | 95.79±2.11 | 99.47±1.05 | 95.26±3.07 | 97.63±2.19 | 99.47±1.05 |
wine | 97.78±2.63 | 96.22±3.73 | 97.78±1.99 | 84.00±7.01 | 97.56±1.56 | 97.11±2.63 |
glass | 87.78±8.65 | 90.19±7.68 | 82.04±7.13 | 89.81±7.60 | 86.48±9.94 | 90.37±7.80 |
forest | 95.08±4.41 | 94.43±1.83 | 94.43±2.34 | 89.02±2.32 | 95.57±2.08 | 93.77±2.62 |
breast | 98.48±1.20 | 98.48±1.12 | 98.42±0.98 | 97.60±2.07 | 96.32±1.33 | 97.84±0.74 |
QSAR | 90.53±1.94 | 77.46±3.01 | 94.73±4.43 | 83.22±1.30 | 85.00±2.35 | 95.16±2.84 |
banknote | 99.48±0.48 | 99.53±0.44 | 99.62±0.35 | 95.82±3.08 | 99.62±0.35 | 99.62±0.35 |
wireless | 97.60±0.53 | 98.16±0.92 | 97.14±0.69 | 92.96±1.47 | 98.20±1.07 | 98.08±0.63 |
wine quality | 85.00±8.97 | 86.78±9.40 | 86.82±9.55 | 86.02±9.48 | 86.73±9.27 | 86.57±9.32 |
electrical | 99.99±0.02 | 68.53±0.80 | 99.99±0.02 | 71.02±1.01 | 100.00±0.00 | 100.00±0.00 |
iono | 92.84±4.58 | 96.36±3.66 | 95.91±3.38 | 96.70±2.89 | 95.68±1.75 | 97.50±2.08 |
sonar | 89.42±5.91 | 91.92±6.98 | 84.46±4.14 | 90.42±5.19 | 87.50±5.54 | 89.42±9.43 |
libras | 90.56±7.37 | 81.22±8.09 | 86.67±10.16 | 88.78±9.05 | 87.89±8.74 | 87.67±8.44 |
DLBCL | 95.50±4.15 | 91.50±6.73 | 93.00±5.57 | 87.00±10.05 | 93.50±5.02 | 99.00±2.00 |
Tab. 6 Comparison of classification accuracy of Decision Tree based attribute reduction
Dataset | Original attributes | DRS (3) | NRS (3) | NRMIE (0) | HANDI (3) | ARCD (9) |
---|---|---|---|---|---|---|
Average | 86.27±2.92 | 82.65±2.50 | 83.98±2.54 | 77.01±3.47 | 81.77±3.09 | 86.14±2.55 |
iris | 97.36±1.42 | 97.37±2.63 | 98.16±1.69 | 95.79±2.41 | 85.00±3.73 | 98.16±1.69 |
wine | 92.33±1.02 | 93.67±2.48 | 93.56±2.89 | 81.56±4.45 | 93.56±3.78 | 95.33±3.36 |
glass | 66.30±7.89 | 61.67±3.52 | 61.67±3.52 | 63.89±7.33 | 61.85±5.93 | 66.48±5.00 |
forest | 93.11±3.50 | 88.85±2.30 | 90.66±3.88 | 91.31±2.94 | 92.13±3.26 | 94.59±2.65 |
breast | 93.86±1.96 | 97.37±0.60 | 94.91±1.14 | 88.36±1.62 | 96.61±1.59 | 96.43±0.99 |
QSAR | 66.89±2.12 | 67.58±1.81 | 82.95±2.27 | 81.82±2.00 | 71.29±1.58 | 75.91±3.08 |
banknote | 97.08±0.96 | 97.08±0.96 | 98.75±0.54 | 75.74±1.31 | 98.75±0.54 | 98.75±0.54 |
wireless | 91.00±0.92 | 93.86±0.83 | 81.60±1.10 | 75.86±1.14 | 91.30±1.13 | 92.44±0.97 |
wine quality | 53.52±1.14 | 51.42±1.30 | 48.42±1.20 | 51.26±1.13 | 52.35±2.00 | 52.57±1.21 |
electrical | 98.52±0.22 | 65.81±0.79 | 98.17±0.19 | 70.35±0.82 | 99.91±0.04 | 99.91±0.04 |
iono | 90.45±3.42 | 87.16±2.59 | 83.07±1.85 | 81.59±2.82 | 76.70±3.02 | 82.61±3.22 |
sonar | 86.73±4.34 | 80.91±4.63 | 81.73±4.81 | 74.04±3.25 | 81.15±4.28 | 82.50±3.89 |
libras | 86.67±5.40 | 80.89±5.52 | 66.11±7.47 | 72.56±7.12 | 81.67±4.87 | 73.34±4.42 |
DLBCL | 94.00±6.63 | 93.50±5.02 | 96.00±3.00 | 74.00±10.20 | 62.50±7.50 | 97.00±4.58 |
Tab. 7 Comparison of classification accuracy of SVM based attribute reduction
Dataset | Original attributes | DRS (1) | NRS (3) | NRMIE (2) | HANDI (3) | ARCD (9) |
---|---|---|---|---|---|---|
Average | 88.26±2.86 | 82.86±3.15 | 88.49±2.90 | 83.98±3.93 | 89.53±2.75 | |
iris | 94.73±1.77 | 97.89±1.58 | 97.63±1.84 | 96.32±2.68 | 96.32±2.68 | 97.63±1.84 |
wine | 98.67±1.78 | 93.78±4.95 | 95.11±2.39 | 86.44±5.48 | 94.67±3.01 | 96.89±2.67 |
glass | 76.11±7.05 | 72.78±7.77 | 72.78±7.77 | 74.44±10.50 | 76.67±5.81 | 77.78±3.99 |
forest | 94.75±4.26 | 91.31±3.74 | 92.79±2.95 | 92.13±3.65 | 91.64±3.47 | 93.77±2.82 |
breast | 96.84±1.29 | 96.78±1.26 | 96.73±0.95 | 96.02±1.63 | 97.25±0.74 | 97.25±0.91 |
QSAR | 84.09±1.98 | 65.91±1.98 | 86.33±1.60 | 82.39±2.04 | 84.77±1.48 | 76.29±1.53 |
banknote | 99.94±0.12 | 99.94±0.12 | 100.00±0.00 | 86.50±1.34 | 100.00±0.00 | 100.00±0.00 |
wireless | 96.33±0.57 | 88.60±1.54 | 95.74±0.92 | 87.90±1.04 | 96.36±0.57 | 96.38±0.52 |
wine quality | 55.67±1.18 | 54.17±0.97 | 54.02±1.35 | 54.21±1.59 | 53.86±0.77 | 52.90±1.08 |
electrical | 94.72±0.34 | 67.87±0.92 | 82.68±0.96 | 71.78±0.58 | 99.98±0.02 | 99.98±0.02 |
iono | 87.50±2.97 | 91.93±3.15 | 93.07±2.61 | 94.97±3.45 | 91.93±3.15 | 94.89±1.70 |
sonar | 77.31±6.01 | 90.00±5.29 | 91.15±6.27 | 89.23±6.84 | 91.92±6.59 | 95.96±3.15 |
libras | 90.00±5.86 | 68.56±3.62 | 89.89±5.65 | 87.89±7.75 | 91.12±5.30 | 91.56±3.56 |
DLBCL | 89.00±4.90 | 80.50±7.23 | 91.00±5.39 | 75.50±6.50 | 87.00±4.90 | 90.50±6.50 |
Tab. 8 Comparison of classification accuracy of Neural Network based attribute reduction
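The average row of Tab. 8 can be cross-checked against the per-dataset rows. The sketch below assumes the average is the plain arithmetic mean of the 14 dataset means. It reproduces the reported DRS figure and, under the same assumption, computes a mean for the ARCD column, whose average cell is empty in the table. The ARCD value is derived here from the listed rows and is not a figure reported by the source.

```python
from statistics import mean

# Neural-network accuracy means per dataset, in table order (Tab. 8).
drs = [97.89, 93.78, 72.78, 91.31, 96.78, 65.91, 99.94,
       88.60, 54.17, 67.87, 91.93, 90.00, 68.56, 80.50]
arcd = [97.63, 96.89, 77.78, 93.77, 97.25, 76.29, 100.00,
        96.38, 52.90, 99.98, 94.89, 95.96, 91.56, 90.50]

drs_avg = round(mean(drs), 2)    # compare with the reported 82.86
arcd_avg = round(mean(arcd), 2)  # derived value, not reported in the table

print(f"DRS average:  {drs_avg}")
print(f"ARCD average: {arcd_avg}")
```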