Journal of Computer Applications ›› 2018, Vol. 38 ›› Issue (12): 3444-3449.DOI: 10.11772/j.issn.1001-9081.2018050954


Low rank non-linear feature selection algorithm

ZHANG Leyuan, LI Jiaye, LI Pengqing   

  1. College of Computer Science and Information Engineering, Guangxi Normal University, Guilin Guangxi 541004, China
  • Received:2018-05-08 Revised:2018-06-29 Online:2018-12-10 Published:2018-12-15
  • Contact: ZHANG Leyuan
  • Supported by:
    This work is partially supported by the National Key Research and Development Program of China (2016YFB1000905), the National Natural Science Foundation of China (61170131, 61263035, 61573270, 90718020), the National Basic Research Program (973 Program) of China (2013CB329404), the China Postdoctoral Science Foundation (2015M570837), the Natural Science Foundation of Guangxi (2015GXNSFCB139011, 2015GXNSFAA139306).

  • About the authors: ZHANG Leyuan (1995-), male, born in Mengcheng, Anhui, M. S. candidate, his research interests include machine learning and data mining; LI Jiaye (1993-), male, born in Jincheng, Shanxi, M. S. candidate, his research interests include machine learning and data mining; LI Pengqing (1993-), male, born in Shenzhen, Guangdong, M. S. candidate, his research interests include machine learning and data mining.

Abstract: To address the non-linearity, low-rank structure, and feature redundancy common in high-dimensional data, an unsupervised feature selection algorithm based on kernel functions was proposed, named Low Rank Non-linear Feature Selection algorithm (LRNFS). Firstly, each feature dimension was mapped into a high-dimensional kernel space, so that non-linear feature selection in the low-dimensional space was achieved through linear feature selection in the kernel space. Then, a bias (deviation) term was introduced into the self-expression form, and the coefficient matrix was constrained to be low-rank and sparse. Finally, a sparse regularization factor on the coefficient vectors of the kernel matrices was introduced to implement feature selection. In the proposed algorithm, the kernel matrices capture the non-linear relationships in the data, the low-rank constraint exploits the global information of the data for subspace learning, and the self-expression form determines the importance of each feature. The experimental results show that, compared with the semi-supervised feature selection algorithm via Rescaled Linear Square Regression (RLSR), the classification accuracy after feature selection with the proposed algorithm is increased by 2.34%. The proposed algorithm can solve the problem that the data are linearly inseparable in the low-dimensional feature space and improve the accuracy of feature selection.
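
The following is a minimal, illustrative sketch of the kernel self-expression idea described above, not the authors' implementation: one RBF kernel matrix is built per feature dimension, the data are reconstructed from these kernels plus a bias term, and features are ranked by the size of their coefficient blocks. The function names (rbf_kernel_1d, lrnfs_sketch), the single Frobenius-norm group penalty standing in for the paper's combined low-rank and sparse regularization, and the plain proximal-gradient solver are all assumptions made for illustration.

import numpy as np

def rbf_kernel_1d(x, gamma=1.0):
    # RBF kernel matrix built from a single feature column x of shape (n,)
    diff = x[:, None] - x[None, :]
    return np.exp(-gamma * diff ** 2)

def lrnfs_sketch(X, lam=0.1, gamma=1.0, n_iter=1000):
    # Approximately minimize ||X - sum_j K_j W_j - 1 b^T||_F^2 + lam * sum_j ||W_j||_F
    # with proximal gradient steps; the group penalty is a simplified stand-in
    # for the low-rank plus sparse regularization described in the abstract.
    n, d = X.shape
    Ks = [rbf_kernel_1d(X[:, j], gamma) for j in range(d)]        # one kernel per feature
    step = 1.0 / sum(np.linalg.norm(K, 2) ** 2 for K in Ks)       # crude step size from a Lipschitz bound
    W = [np.zeros((n, d)) for _ in range(d)]                      # coefficient block per feature
    b = np.zeros(d)                                               # bias (deviation) term
    for _ in range(n_iter):
        R = X - sum(Ks[j] @ W[j] for j in range(d)) - b           # reconstruction residual
        b = b + step * R.sum(axis=0)                              # gradient step on the bias
        for j in range(d):
            G = W[j] + step * Ks[j].T @ R                         # gradient step on block j
            s = np.linalg.norm(G)
            W[j] = max(0.0, 1.0 - step * lam / (s + 1e-12)) * G   # proximal shrinkage of the block
    scores = np.array([np.linalg.norm(W[j]) for j in range(d)])
    return np.argsort(-scores)                                    # feature indices, most important first

# Toy usage on synthetic data: rank five features and print the two top-ranked indices.
X = np.random.RandomState(0).randn(100, 5)
X[:, 1] = np.sin(X[:, 0])   # feature 1 is a non-linear function of feature 0
print(lrnfs_sketch(X)[:2])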

Key words: feature selection, kernel function, subspace learning, low rank representation, sparse processing

