Multi-level feature selection algorithm based on mutual information
YONG Juya 1,2, ZHOU Zhongmei 1,2
1. School of Computer Science, Minnan Normal University, Zhangzhou Fujian 363000, China; 2. Key Laboratory of Data Science and Intelligence Application, Fujian Province University, Zhangzhou Fujian 363000, China
Abstract: In feature selection, removing redundancy becomes very complicated when a large number of features are selected, and some features show strong correlation with the label only after being combined with other features. Aiming at these two problems, a Multi-Level Feature Selection algorithm based on Mutual Information (MI_MLFS) was proposed. Firstly, features were divided into strongly correlated, sub-strongly correlated and other features according to the degree of their correlation with the label. Secondly, after the strongly correlated features were selected, the sub-strongly correlated features with low redundancy were added to the subset. Finally, the features that further enhanced the correlation between the selected feature subset and the label were added. MI_MLFS was compared with ReliefF, the minimal-Redundancy-Maximal-Relevance criterion (mRMR), Joint Mutual Information (JMI), the Conditional Mutual Information Maximization criterion (CMIM) and Double Input Symmetrical Relevance (DISR) on 15 datasets. The results show that MI_MLFS achieves the highest classification accuracy on 13 datasets with the Support Vector Machine (SVM) classifier and on 11 datasets with the Classification And Regression Tree (CART) classifier, demonstrating better classification performance than many classical feature selection algorithms.
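As a concrete illustration of the three-level procedure described in the abstract, the following is a minimal Python sketch. The thresholds tau_strong and tau_sub that split features into strongly correlated, sub-strongly correlated and other features, the redundancy test in level two, and the joint-relevance estimate in level three are all illustrative assumptions, not the paper's exact criteria.

import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.metrics import mutual_info_score

def mi_mlfs_sketch(X, y, tau_strong=0.5, tau_sub=0.2, tau_red=0.7, eps=1e-3):
    """X: (n_samples, n_features) array of discrete features; y: class labels.
    All four thresholds are illustrative assumptions, not the paper's values."""
    X = np.asarray(X)
    n_features = X.shape[1]
    # Relevance of each individual feature to the label.
    rel = mutual_info_classif(X, y, discrete_features=True)

    # Level 1: split features by their correlation with the label and
    # keep every strongly correlated feature.
    strong = [j for j in range(n_features) if rel[j] >= tau_strong]
    sub = [j for j in range(n_features) if tau_sub <= rel[j] < tau_strong]
    rest = [j for j in range(n_features) if rel[j] < tau_sub]
    selected = list(strong)

    # Level 2: among the sub-strongly correlated features, keep those
    # whose redundancy against the current subset is low.
    for j in sorted(sub, key=lambda k: -rel[k]):
        red = max((mutual_info_score(X[:, j], X[:, s]) for s in selected),
                  default=0.0)
        if red < tau_red * rel[j]:
            selected.append(j)

    # Level 3: greedily add remaining features that raise the estimated
    # correlation between the whole subset and the label. The subset is
    # treated as one discrete variable by hashing each row, a crude
    # plug-in estimate that is optimistic on small samples (hence eps).
    def subset_relevance(cols):
        joint = [hash(tuple(row)) for row in X[:, cols]]
        return mutual_info_score(joint, y)

    base = subset_relevance(selected) if selected else 0.0
    for j in sorted(rest, key=lambda k: -rel[k]):
        gain = subset_relevance(selected + [j]) - base
        if gain > eps:
            selected.append(j)
            base += gain
    return selected

Calling selected = mi_mlfs_sketch(X, y) on discretized data returns the indices of the chosen features. In practice the thresholds would need tuning per dataset, and the paper's own correlation and redundancy measures should be substituted where they differ from this sketch.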