Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (10): 3046-3053.DOI: 10.11772/j.issn.1001-9081.2021081486
Special Issue: Artificial intelligence
Qian GE, Guangbin ZHANG, Xiaofeng ZHANG
Received: 2021-08-19
Revised: 2021-11-20
Accepted: 2021-11-21
Online: 2022-01-07
Published: 2022-10-10
Contact: Guangbin ZHANG
About author: GE Qian, born in 1995 in Tengzhou, Shandong, M. S. candidate. Her research interests include signal processing.
Qian GE, Guangbin ZHANG, Xiaofeng ZHANG. Automatic feature selection algorithm based on interaction of ReliefF with maximum information coefficient and SVM[J]. Journal of Computer Applications, 2022, 42(10): 3046-3053.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2021081486
Dataset | Total samples | Features | Training samples | Test samples |
---|---|---|---|---|
WDBC | 569 | 30 | 341 | 228 |
Ionosphere | 351 | 34 | 211 | 140 |
Horse Colic | 368 | 25 | 221 | 147 |
Mushroom | 8124 | 22 | 4874 | 3250 |
Parkinsons | 195 | 22 | 117 | 78 |
Connectionist Bench | 208 | 60 | 125 | 83 |
Musk | 476 | 166 | 286 | 190 |
Tab. 1 Information of experimental datasets
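The training and test sizes in Tab. 1 appear to follow a 60/40 split of each dataset, rounded to the nearest sample. That ratio is inferred from the table rather than stated here, so the sketch below is only a consistency check under that assumption; the helper name `split_sizes` is hypothetical.

```python
# Dataset sizes copied from Tab. 1: name -> (total, train, test).
TAB1 = {
    "WDBC": (569, 341, 228),
    "Ionosphere": (351, 211, 140),
    "Horse Colic": (368, 221, 147),
    "Mushroom": (8124, 4874, 3250),
    "Parkinsons": (195, 117, 78),
    "Connectionist Bench": (208, 125, 83),
    "Musk": (476, 286, 190),
}


def split_sizes(total, train_tenths=6):
    """Return (train, test) sizes for a train_tenths/10 split.

    Assumption: the training size is rounded half up; integer
    arithmetic avoids floating-point rounding surprises.
    """
    train = (total * train_tenths + 5) // 10
    return train, total - train


# Compare the assumed 60/40 split against every row of Tab. 1.
for name, (total, train, test) in TAB1.items():
    matches = split_sizes(total) == (train, test)
    print(f"{name}: {matches}")
```

If the assumption holds, each dataset prints a match, which suggests the sizes can be regenerated from the total sample counts alone.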
Dataset | Original features | Selected features (ReliefF-SVM) | Selected features (MICReliefF-SVM) |
---|---|---|---|
WDBC | 30 | 19 | 16 |
Ionosphere | 34 | 16 | 6 |
Horse Colic | 25 | 17 | 12 |
Mushroom | 22 | 18 | 11 |
Parkinsons | 22 | 7 | 3 |
Connectionist Bench | 60 | 7 | 7 |
Musk | 166 | 49 | 35 |
Tab. 2 Comparison of feature filtering results of different algorithms
Actual class | Predicted positive | Predicted negative |
---|---|---|
Positive | TP (True Positive) | FN (False Negative) |
Negative | FP (False Positive) | TN (True Negative) |
Tab. 3 Confusion matrix
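The evaluation metrics reported in the result tables (accuracy, sensitivity, specificity, precision, F-measure) can all be derived from the four confusion-matrix counts. The following is a minimal sketch using the standard definitions; it assumes the F-measure is the F1 score (harmonic mean of precision and sensitivity), which this page does not state explicitly. The names `ConfusionMatrix` and `evaluate` are hypothetical, and the counts in the example are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    tp: int  # actual positive, predicted positive
    fn: int  # actual positive, predicted negative
    fp: int  # actual negative, predicted positive
    tn: int  # actual negative, predicted negative


def _ratio(num, den):
    """Divide, returning 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def evaluate(cm):
    """Compute the five metrics from one confusion matrix."""
    sensitivity = _ratio(cm.tp, cm.tp + cm.fn)
    precision = _ratio(cm.tp, cm.tp + cm.fp)
    return {
        "accuracy": _ratio(cm.tp + cm.tn, cm.tp + cm.fn + cm.fp + cm.tn),
        "sensitivity": sensitivity,
        "specificity": _ratio(cm.tn, cm.tn + cm.fp),
        "precision": precision,
        # Assumption: F-measure taken as F1.
        "f_measure": _ratio(2 * precision * sensitivity,
                            precision + sensitivity),
    }


# Made-up counts for illustration only (not taken from the paper).
example = evaluate(ConfusionMatrix(tp=40, fn=10, fp=5, tn=45))
for metric, value in example.items():
    print(f"{metric}: {value:.4f}")
```

Because sensitivity and specificity are computed per class, a classifier can score high on one and low on the other, as in the Parkinsons rows of the SVM results.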
Dataset | Metric | All features (mean) | All features (std) | ReliefF-SVM subset (mean) | ReliefF-SVM subset (std) | MICReliefF-SVM subset (mean) | MICReliefF-SVM subset (std) |
---|---|---|---|---|---|---|---|
WDBC | Accuracy | 0.9437 | 0.0002 | 0.9465 | 0.0001 | 0.9510 | 0.0001 |
WDBC | Sensitivity | 0.8542 | 0.0011 | 0.8685 | 0.0009 | 0.8809 | 0.0008 |
WDBC | Specificity | 0.9975 | 0.0002 | 0.9938 | 0.0005 | 0.9940 | 0.0005 |
WDBC | Precision | 0.9948 | 0.0010 | 0.9880 | 0.0002 | 0.9884 | 0.0002 |
WDBC | F-measure | 0.9188 | 0.0004 | 0.9241 | 0.0003 | 0.9289 | 0.0003 |
Ionosphere | Accuracy | 0.8403 | 0.0013 | 0.8594 | 0.0011 | 0.8627 | 0.0006 |
Ionosphere | Sensitivity | 0.9978 | 0.0002 | 0.9971 | 0.0003 | 0.9830 | 0.0003 |
Ionosphere | Specificity | 0.5684 | 0.0059 | 0.6163 | 0.0050 | 0.6430 | 0.0030 |
Ionosphere | Precision | 0.8010 | 0.0019 | 0.8226 | 0.0018 | 0.8353 | 0.0010 |
Ionosphere | F-measure | 0.8880 | 0.0007 | 0.9008 | 0.0006 | 0.9028 | 0.0003 |
Horse Colic | Accuracy | 0.8074 | 0.0014 | 0.8346 | 0.0010 | 0.8431 | 0.0007 |
Horse Colic | Sensitivity | 0.6659 | 0.0080 | 0.7368 | 0.0043 | 0.7468 | 0.0036 |
Horse Colic | Specificity | 0.8931 | 0.0023 | 0.8942 | 0.0018 | 0.8998 | 0.0016 |
Horse Colic | Precision | 0.7854 | 0.0063 | 0.8076 | 0.0047 | 0.8155 | 0.0033 |
Horse Colic | F-measure | 0.7107 | 0.0032 | 0.7673 | 0.0019 | 0.7767 | 0.0017 |
Mushroom | Accuracy | 0.9568 | 0.0000 | 0.9629 | 0.0000 | 0.9630 | 0.0000 |
Mushroom | Sensitivity | 0.9516 | 0.0000 | 0.9639 | 0.0000 | 0.9657 | 0.0000 |
Mushroom | Specificity | 0.9616 | 0.0000 | 0.9620 | 0.0000 | 0.9617 | 0.0000 |
Mushroom | Precision | 0.9584 | 0.0000 | 0.9593 | 0.0000 | 0.9591 | 0.0000 |
Mushroom | F-measure | 0.9549 | 0.0000 | 0.9611 | 0.0000 | 0.9624 | 0.0000 |
Parkinsons | Accuracy | 0.7555 | 0.0033 | 0.8221 | 0.0025 | 0.8382 | 0.0018 |
Parkinsons | Sensitivity | 0.9997 | 0.0005 | 0.9967 | 0.0001 | 0.9927 | 0.0001 |
Parkinsons | Specificity | 0.0215 | 0.0099 | 0.3069 | 0.0206 | 0.3430 | 0.0105 |
Parkinsons | Precision | 0.7561 | 0.0025 | 0.8119 | 0.0014 | 0.8188 | 0.0012 |
Parkinsons | F-measure | 0.8601 | 0.0090 | 0.8937 | 0.0010 | 0.8964 | 0.0005 |
Connectionist Bench | Accuracy | 0.6705 | 0.0049 | 0.7273 | 0.0028 | 0.7834 | 0.0014 |
Connectionist Bench | Sensitivity | 0.8165 | 0.0227 | 0.7884 | 0.0062 | 0.8520 | 0.0055 |
Connectionist Bench | Specificity | 0.5328 | 0.0456 | 0.6880 | 0.0109 | 0.7158 | 0.0074 |
Connectionist Bench | Precision | 0.6916 | 0.0117 | 0.7261 | 0.0068 | 0.7725 | 0.0045 |
Connectionist Bench | F-measure | 0.7232 | 0.0031 | 0.7514 | 0.0024 | 0.8052 | 0.0012 |
Musk | Accuracy | 0.6553 | 0.0045 | 0.7194 | 0.0018 | 0.7388 | 0.0016 |
Musk | Sensitivity | 0.9258 | 0.0083 | 0.8772 | 0.0031 | 0.9053 | 0.0028 |
Musk | Specificity | 0.3230 | 0.0430 | 0.5249 | 0.0095 | 0.5278 | 0.0097 |
Musk | Precision | 0.6455 | 0.0063 | 0.7039 | 0.0029 | 0.7160 | 0.0028 |
Musk | F-measure | 0.7524 | 0.0013 | 0.7779 | 0.0009 | 0.7969 | 0.0007 |
Tab. 4 Comparison of the mean and standard deviation of evaluation metrics for each feature selection algorithm in the SVM model
Dataset | Metric | All features (mean) | All features (std) | ReliefF-SVM subset (mean) | ReliefF-SVM subset (std) | MICReliefF-SVM subset (mean) | MICReliefF-SVM subset (std) |
---|---|---|---|---|---|---|---|
WDBC | Accuracy | 0.9564 | 0.0002 | 0.9582 | 0.0009 | 0.9586 | 0.0007 |
WDBC | Sensitivity | 0.9062 | 0.0008 | 0.9067 | 0.0006 | 0.9082 | 0.0005 |
WDBC | Specificity | 0.9861 | 0.0001 | 0.9890 | 0.0005 | 0.9883 | 0.0004 |
WDBC | Precision | 0.9748 | 0.0003 | 0.9800 | 0.0002 | 0.9790 | 0.0001 |
WDBC | F-measure | 0.9388 | 0.0003 | 0.9416 | 0.0003 | 0.9418 | 0.0002 |
Ionosphere | Accuracy | 0.8541 | 0.0014 | 0.8728 | 0.0010 | 0.8895 | 0.0008 |
Ionosphere | Sensitivity | 0.9833 | 0.0005 | 0.9854 | 0.0002 | 0.9739 | 0.0004 |
Ionosphere | Specificity | 0.6233 | 0.0075 | 0.6700 | 0.0058 | 0.7408 | 0.0056 |
Ionosphere | Precision | 0.8247 | 0.0019 | 0.8443 | 0.0015 | 0.8698 | 0.0013 |
Ionosphere | F-measure | 0.8964 | 0.0007 | 0.9089 | 0.0006 | 0.9184 | 0.0004 |
Horse Colic | Accuracy | 0.7896 | 0.0014 | 0.8220 | 0.0008 | 0.8361 | 0.0007 |
Horse Colic | Sensitivity | 0.6857 | 0.0058 | 0.7532 | 0.0041 | 0.7678 | 0.0021 |
Horse Colic | Specificity | 0.8545 | 0.0018 | 0.8632 | 0.0015 | 0.8769 | 0.0014 |
Horse Colic | Precision | 0.7452 | 0.0044 | 0.7649 | 0.0033 | 0.7872 | 0.0031 |
Horse Colic | F-measure | 0.7113 | 0.0029 | 0.7571 | 0.0018 | 0.7753 | 0.0014 |
Mushroom | Accuracy | 0.9224 | 0.0002 | 0.9272 | 0.0002 | 0.9363 | 0.0001 |
Mushroom | Sensitivity | 0.9001 | 0.0008 | 0.9048 | 0.0005 | 0.9310 | 0.0003 |
Mushroom | Specificity | 0.9433 | 0.0003 | 0.9480 | 0.0002 | 0.9413 | 0.0002 |
Mushroom | Precision | 0.9367 | 0.0003 | 0.9418 | 0.0003 | 0.9366 | 0.0002 |
Mushroom | F-measure | 0.9178 | 0.0003 | 0.9228 | 0.0002 | 0.9336 | 0.0002 |
Parkinsons | Accuracy | 0.8413 | 0.0014 | 0.8563 | 0.0011 | 0.8706 | 0.0010 |
Parkinsons | Sensitivity | 0.9328 | 0.0025 | 0.9391 | 0.0016 | 0.9515 | 0.0009 |
Parkinsons | Specificity | 0.5648 | 0.0135 | 0.6146 | 0.0113 | 0.6270 | 0.0108 |
Parkinsons | Precision | 0.8684 | 0.0017 | 0.8794 | 0.0016 | 0.8874 | 0.0015 |
Parkinsons | F-measure | 0.8984 | 0.0008 | 0.9071 | 0.0006 | 0.9172 | 0.0005 |
Connectionist Bench | Accuracy | 0.7145 | 0.0026 | 0.7549 | 0.0018 | 0.7805 | 0.0016 |
Connectionist Bench | Sensitivity | 0.7461 | 0.0063 | 0.8180 | 0.0042 | 0.8307 | 0.0039 |
Connectionist Bench | Specificity | 0.6803 | 0.0074 | 0.6885 | 0.0065 | 0.7287 | 0.0050 |
Connectionist Bench | Precision | 0.7326 | 0.0047 | 0.7478 | 0.0036 | 0.7727 | 0.0026 |
Connectionist Bench | F-measure | 0.7355 | 0.0027 | 0.7784 | 0.0017 | 0.7981 | 0.0014 |
Musk | Accuracy | 0.6931 | 0.0014 | 0.7231 | 0.0011 | 0.7351 | 0.0008 |
Musk | Sensitivity | 0.6263 | 0.0051 | 0.6484 | 0.0048 | 0.6499 | 0.0035 |
Musk | Specificity | 0.7493 | 0.0032 | 0.7833 | 0.0026 | 0.8013 | 0.0024 |
Musk | Precision | 0.6602 | 0.0037 | 0.6985 | 0.0033 | 0.7153 | 0.0031 |
Musk | F-measure | 0.6380 | 0.0021 | 0.6691 | 0.0020 | 0.6781 | 0.0014 |
Tab. 5 Comparison of the mean and standard deviation of evaluation metrics for each feature selection algorithm in the ELM model
Dataset | SVM: all features | SVM: ReliefF-SVM subset | SVM: MICReliefF-SVM subset | ELM: all features | ELM: ReliefF-SVM subset | ELM: MICReliefF-SVM subset |
---|---|---|---|---|---|---|
WDBC | 1.3519 | 1.0278 | 0.9220 | 0.7046 | 0.6888 | 0.6384 |
Ionosphere | 1.0307 | 0.6833 | 0.5134 | 0.6477 | 0.6406 | 0.6163 |
Horse Colic | 1.2562 | 0.9070 | 0.7438 | 0.6572 | 0.6354 | 0.6325 |
Mushroom | 93.0268 | 75.3059 | 57.0234 | 2.1104 | 2.0945 | 2.0511 |
Parkinsons | 0.4334 | 0.3630 | 0.3464 | 0.5752 | 0.5572 | 0.5557 |
Connectionist Bench | 0.8694 | 0.4158 | 0.4120 | 0.6124 | 0.5607 | 0.5582 |
Musk | 7.0133 | 2.8286 | 2.1096 | 0.7629 | 0.7060 | 0.6952 |
Tab. 6 Comparison of running time among different feature selection algorithms in SVM model and ELM model
Dataset | MI | mRMR | SVM-RFE | CFS | RF | GA | MICReliefF-SVM |
---|---|---|---|---|---|---|---|
WDBC | 0.9389 | 0.9482 | 0.9444 | 0.9407 | 0.9391 | 0.9465 | 0.9510 |
Ionosphere | 0.7965 | 0.8496 | 0.8254 | 0.8488 | 0.8584 | 0.8268 | 0.8627 |
Horse Colic | 0.8202 | 0.8414 | 0.8271 | 0.8201 | 0.8108 | 0.6469 | 0.8431 |
Mushroom | 0.9619 | 0.9553 | 0.9409 | 0.9292 | 0.9760 | 0.9218 | 0.9630 |
Parkinsons | 0.7981 | 0.8295 | 0.7626 | 0.7835 | 0.7615 | 0.7490 | 0.8382 |
Connectionist Bench | 0.7487 | 0.7654 | 0.6157 | 0.7718 | 0.7802 | 0.6754 | 0.7834 |
Musk | 0.7309 | 0.7249 | 0.5637 | 0.7282 | 0.7244 | 0.6494 | 0.7388 |
Tab. 7 Comparison of classification accuracy among different feature selection algorithms in SVM model
Dataset | MI | mRMR | SVM-RFE | CFS | RF | GA | MICReliefF-SVM |
---|---|---|---|---|---|---|---|
WDBC | 0.9455 | 0.9551 | 0.9517 | 0.9321 | 0.9510 | 0.9565 | 0.9586 |
Ionosphere | 0.8724 | 0.8654 | 0.8548 | 0.7251 | 0.8816 | 0.8561 | 0.8895 |
Horse Colic | 0.6735 | 0.8287 | 0.8235 | 0.8337 | 0.8235 | 0.6703 | 0.8361 |
Mushroom | 0.9284 | 0.9203 | 0.9019 | 0.9289 | 0.9487 | 0.9105 | 0.9363 |
Parkinsons | 0.8330 | 0.8445 | 0.8510 | 0.8658 | 0.8526 | 0.8301 | 0.8706 |
Connectionist Bench | 0.7346 | 0.7578 | 0.7441 | 0.7454 | 0.7564 | 0.7151 | 0.7805 |
Musk | 0.7263 | 0.7154 | 0.7309 | 0.7292 | 0.7384 | 0.6927 | 0.7346 |
Tab. 8 Comparison of classification accuracy among different feature selection algorithms in ELM model