Unsupervised feature selection approach based on spectral analysis
Feng PAN1,2, Jiang-dong WANG1, Ben NIU2
1. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing Jiangsu 210016, China
2. College of Management, Shenzhen University, Shenzhen Guangdong 518060, China
Abstract: To improve the performance of unsupervised feature selection, the relationship between the distribution of the K smallest eigenvalues of the normalized graph Laplacian matrix and the cluster structure of the data was identified, and a new feature selection algorithm based on spectral analysis was proposed. Because the eigen-decomposition can make the algorithm time-consuming, the Nyström method was applied to reduce its computational cost. Experiments on synthetic and real-world data sets show the efficiency of the proposed approach.
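The link between small Laplacian eigenvalues and cluster structure can be illustrated with a minimal sketch. It is not the authors' algorithm: the Gaussian affinity, the bandwidth `sigma`, the helper name `smallest_laplacian_eigenvalues`, and the synthetic data are all assumptions made for illustration. The sketch builds the symmetric normalized Laplacian from a single feature and compares the second-smallest eigenvalue for a feature that separates two clusters against one that is pure noise.

```python
import numpy as np


def smallest_laplacian_eigenvalues(X, k, sigma=1.0):
    """Return the k smallest eigenvalues of L = I - D^{-1/2} W D^{-1/2}.

    X is an (n_samples, n_features) array. The Gaussian affinity and the
    bandwidth sigma are illustrative assumptions, not taken from the paper.
    """
    # Pairwise squared Euclidean distances, clipped at zero for stability.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)

    # Gaussian affinity matrix without self-loops.
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Symmetric normalized graph Laplacian.
    deg = W.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(deg)
    L = np.eye(X.shape[0]) - inv_sqrt[:, None] * W * inv_sqrt[None, :]

    # eigvalsh returns eigenvalues in ascending order.
    return np.linalg.eigvalsh(L)[:k]


# Hypothetical synthetic data: one feature carries two well-separated
# clusters, the other is uniform noise with no cluster structure.
rng = np.random.default_rng(0)
informative = np.concatenate([rng.normal(0.0, 0.3, 50),
                              rng.normal(5.0, 0.3, 50)])
noise = rng.uniform(0.0, 5.0, 100)

eig_info = smallest_laplacian_eigenvalues(informative[:, None], k=2)
eig_noise = smallest_laplacian_eigenvalues(noise[:, None], k=2)

print("informative feature:", eig_info)
print("noise feature:      ", eig_noise)
```

With the clustered feature, both of the two smallest eigenvalues sit close to zero, which reflects a graph that nearly splits into two components; with the noise feature, only the first eigenvalue is near zero. A feature selection criterion could rank features by a score of this kind, although the exact score used in the paper is not stated in the abstract. The sketch uses a dense eigen-decomposition, so it does not include the Nyström approximation.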
Feng PAN, Jiang-dong WANG, Ben NIU. Unsupervised feature selection approach based on spectral analysis [J]. Journal of Computer Applications, 2011, 31(08): 2108-2110.