Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (3): 688-694. DOI: 10.11772/j.issn.1001-9081.2021040789
Special Issue: Artificial Intelligence; 2021 CCF Conference on Artificial Intelligence (CCFAI 2021)
Xiaoqing ZHANG, Chenxi WANG, Yan LYU, Yaojin LIN
Received: 2021-05-17
Revised: 2021-07-11
Accepted: 2021-07-14
Online: 2021-11-09
Published: 2022-03-10
Contact: Chenxi WANG
About author: ZHANG Xiaoqing, born in 1998 in Quanzhou, Fujian, M. S. candidate, CCF member. Her research interests include data mining.
Xiaoqing ZHANG, Chenxi WANG, Yan LYU, Yaojin LIN. Hierarchical classification online streaming feature selection algorithm based on ReliefF algorithm[J]. Journal of Computer Applications, 2022, 42(3): 688-694.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2021040789
Dataset | Type | Samples | Features | Classes | Nodes | Levels |
---|---|---|---|---|---|---|
AWA | Image | 6405 | 252 | 10 | 17 | 4 |
Bridges | Image | 108 | 12 | 6 | 8 | 3 |
Cifar | Image | 50000 | 512 | 100 | 21 | 3 |
DD | Protein | 3625 | 473 | 27 | 32 | 3 |
F194 | Protein | 8525 | 473 | 194 | 202 | 3 |
VOC | Image | 7178 | 1000 | 88 | 30 | 5 |
Tab. 1 Description of datasets
Dataset | OSFS | FOSFS | SAOLA | A3M | OFSD | OH_Relief |
---|---|---|---|---|---|---|
Mean | 0.2896 | 0.2641 | 0.2617 | 0.3068 | 0.4249 | |
AWA | 0.1939 | 0.1717 | 0.2190 | 0.1802 | 0.1942 | |
Bridges | 0.6300 | 0.4178 | 0.6300 | 0.5489 | 0.5078 | |
Cifar | 0.0673 | 0.0754 | 0.0151 | 0.2585 | 0.1491 | |
DD | 0.4058 | 0.4058 | 0.3172 | 0.5584 | 0.7045 | |
F194 | 0.2429 | 0.2596 | 0.1919 | 0.2217 | 0.5706 | |
VOC | 0.1923 | 0.2026 | 0.1768 | 0.2239 | 0.2824 |
Tab. 2 Classification accuracy based on KNN classifier (↑)
Dataset | OSFS | FOSFS | SAOLA | A3M | OFSD | OH_Relief |
---|---|---|---|---|---|---|
Mean | 0.5542 | 0.5436 | 0.5353 | 0.5696 | 0.6371 | |
AWA | 0.4520 | 0.4348 | 0.4679 | 0.4439 | 0.4490 | |
Bridges | 0.7852 | 0.6741 | 0.7852 | 0.7407 | 0.7157 | |
Cifar | 0.3899 | 0.3961 | 0.3514 | 0.5234 | 0.4474 | |
DD | 0.6598 | 0.6598 | 0.5969 | 0.7499 | 0.8334 | |
F194 | 0.5664 | 0.5830 | 0.5346 | 0.5630 | 0.7536 | |
VOC | 0.4717 | 0.4793 | 0.4607 | 0.4974 | 0.5378 |
Tab. 3 LCA-F1 values based on KNN classifier (↑)
Dataset | OSFS | FOSFS | SAOLA | A3M | OFSD | OH_Relief |
---|---|---|---|---|---|---|
Mean | 2.5873 | 2.6061 | 2.7044 | 2.4561 | 2.1257 | |
AWA | 3.8258 | 3.9326 | 3.6971 | 3.8329 | 3.8473 | |
Bridges | 1.4815 | 1.0093 | 1.2037 | 1.3148 | 1.0000 | |
Cifar | 3.5906 | 3.5485 | 3.8438 | 2.7536 | 3.2277 | |
DD | 1.7054 | 1.7054 | 2.1065 | 1.2342 | 0.8172 | |
F194 | 2.1752 | 2.0420 | 2.3510 | 2.1309 | 1.2386 | |
VOC | 3.2176 | 3.1549 | 3.2923 | 2.8846 | 2.7215 |
Tab. 4 TIE values based on KNN classifier (↓)
Dataset | OSFS | FOSFS | SAOLA | A3M | OFSD | OH_Relief |
---|---|---|---|---|---|---|
Mean | 0.2925 | 0.3026 | 0.2672 | 0.2631 | 0.4773 | |
AWA | 0.2080 | 0.2195 | 0.1781 | 0.1867 | 0.2810 | |
Bridges | 0.6256 | 0.5644 | 0.6667 | |||
Cifar | 0.0674 | 0.0747 | 0.0201 | 0.2710 | 0.1289 | |
DD | 0.3704 | 0.3707 | 0.2929 | 0.3079 | 0.7925 | |
F194 | 0.2197 | 0.2252 | 0.0879 | 0.1010 | 0.5936 | |
VOC | 0.2594 | 0.2671 | 0.2568 | 0.2898 | 0.3593 |
Tab. 5 Classification accuracy based on LSVM classifier (↑)
Dataset | OSFS | FOSFS | SAOLA | A3M | OFSD | OH_Relief |
---|---|---|---|---|---|---|
Mean | 0.5580 | 0.5652 | 0.5414 | 0.5340 | 0.6724 | |
AWA | 0.4643 | 0.4712 | 0.4484 | 0.4482 | 0.5129 | |
Bridges | 0.7852 | 0.7852 | 0.7852 | 0.7528 | 0.8157 | |
Cifar | 0.3908 | 0.3967 | 0.3559 | 0.5328 | 0.4352 | |
DD | 0.6338 | 0.6339 | 0.5786 | 0.5743 | 0.8832 | |
F194 | 0.5553 | 0.5635 | 0.4517 | 0.4526 | 0.7714 | |
VOC | 0.5184 | 0.5236 | 0.5165 | 0.5411 | 0.5889 |
Tab. 6 LCA-F1 values based on LSVM classifier (↑)
Dataset | OSFS | FOSFS | SAOLA | A3M | OFSD | OH_Relief |
---|---|---|---|---|---|---|
Mean | 2.5305 | 2.4828 | 2.6289 | 2.7065 | 1.9019 | |
AWA | 3.6868 | 3.6434 | 3.7627 | 3.8151 | 3.3499 | |
Bridges | 1.0093 | 1.0093 | 1.0093 | 1.1389 | 0.8426 | |
Cifar | 3.5803 | 3.5390 | 3.8094 | 2.6907 | 3.2936 | |
DD | 1.8759 | 1.8759 | 2.2284 | 2.3399 | 0.5709 | |
F194 | 2.2133 | 2.1370 | 2.9300 | 2.9713 | 1.1156 | |
VOC | 2.8175 | 2.7877 | 2.8268 | 2.6800 | 2.3966 |
Tab. 7 TIE values based on LSVM classifier (↓)
Dataset | Original features | Selected features (A3M) | Selected features (OFSD) | Selected features (OH_Relief) |
---|---|---|---|---|
Bridges | 12 | 6 | 3 | 6 |
DD | 473 | 158 | 6 | 50 |
F194 | 473 | 4 | 6 | 50 |
Tab. 8 Number of features selected by different algorithms on three datasets