Journal of Computer Applications ›› 2015, Vol. 35 ›› Issue (10): 2752-2756. DOI: 10.11772/j.issn.1001-9081.2015.10.2752

• Papers of the 15th China Conference on Machine Learning (CCML2015) •

New discriminative feature selection method

WU Jinhua1,2, ZUO Kaizhong1,2, JIE Biao1,2,3, DING Xintao1,2

  1. School of Mathematics and Computer Science, Anhui Normal University, Wuhu Anhui 241003, China;
  2. Network and Information Security Engineering Research Center, Anhui Normal University, Wuhu Anhui 241003, China;
  3. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing Jiangsu 210016, China
  • Received: 2015-06-16 Revised: 2015-06-27 Online: 2015-10-10 Published: 2015-10-14
  • Corresponding author: WU Jinhua (born 1991), male, from Anqing, Anhui, M.S. candidate; research interests: machine learning, information security; ahnu_wjh@139.com
  • About the authors: ZUO Kaizhong (born 1974), male, from Suzhou, Anhui, professor, Ph.D., CCF member; research interests: information security, machine learning. JIE Biao (born 1977), male, from Suzhou, Anhui, associate professor, Ph.D.; research interests: machine learning, medical image processing. DING Xintao (born 1979), male, from Wuhu, Anhui, lecturer, Ph.D., CCF member; research interests: pattern recognition, image processing.
  • Supported by:
    National Natural Science Foundation of China (61472005); Natural Science Foundation of Anhui Province (1508085MF125); Open Project of the National Laboratory of Pattern Recognition (201407361).

Abstract: Feature selection is a common data preprocessing step that can not only improve classification performance but also make classification results more interpretable. Sparse-learning-based feature selection methods sometimes ignore useful discriminative information, which can degrade the final classification performance. To address this problem, a new discriminative feature selection method, Discriminative Least Absolute Shrinkage and Selection Operator (D-LASSO), is proposed to select more discriminative features. First, the D-LASSO model contains an L1-norm regularization term that produces a sparse solution. Second, to induce more discriminative features, a new discriminative regularization term is added to the model; it preserves the geometric distribution information of samples within the same class and of samples from different classes. Experimental results on a series of benchmark datasets show that, compared with existing methods, D-LASSO further improves classification accuracy and is also fairly robust to its parameters.

Key words: feature selection, sparse solution, L1-norm, discriminative regularization term, classification
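
The abstract names the two ingredients of D-LASSO (an L1-norm term for sparsity and a discriminative term built from same-class and different-class sample geometry) but does not give the objective function. The following is therefore only an illustrative sketch under stated assumptions, not the authors' formulation: it assumes a least-squares loss, a graph-Laplacian form for the discriminative term, and a plain proximal-gradient (ISTA) solver. The names `class_laplacians`, `soft_threshold`, `d_lasso_sketch`, `lam`, and `sigma` are hypothetical.

```python
import numpy as np


def class_laplacians(y):
    """Laplacians of the 0/1 same-class and different-class sample graphs."""
    same = (y[:, None] == y[None, :]).astype(float)
    np.fill_diagonal(same, 0.0)
    diff = 1.0 - (y[:, None] == y[None, :]).astype(float)

    def laplacian(adj):
        return np.diag(adj.sum(axis=1)) - adj

    return laplacian(same), laplacian(diff)


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (elementwise shrinkage toward zero)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def d_lasso_sketch(X, y, lam=0.1, sigma=0.1, n_iter=500):
    """ISTA for an ASSUMED objective (not taken from the paper):

        1/(2n) * ||X w - y||^2 + lam * ||w||_1 + sigma/2 * w^T M w,
        M = X^T (L_same - L_diff) X / n^2

    The quadratic term pulls projections of same-class samples together
    and pushes projections of different-class samples apart.
    """
    n, d = X.shape
    L_same, L_diff = class_laplacians(y)
    M = X.T @ (L_same - L_diff) @ X / n**2
    H = X.T @ X / n + sigma * M          # Hessian of the smooth part
    eig = np.linalg.eigvalsh(H)          # eigenvalues in ascending order
    if eig[0] < -1e-10:
        # M can be indefinite; a large sigma would make the problem non-convex.
        raise ValueError("sigma too large: smooth part is not convex")
    step = 1.0 / eig[-1]                 # 1 / Lipschitz constant of the gradient
    b = X.T @ y / n
    w = np.zeros(d)
    for _ in range(n_iter):
        grad = H @ w - b                 # gradient of the smooth part
        w = soft_threshold(w - step * grad, step * lam)
    return w


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 10))
    y = np.sign(X[:, 0] + X[:, 1])       # only features 0 and 1 carry label information
    w = d_lasso_sketch(X, y, lam=0.3, sigma=0.1)
    print("selected features:", np.flatnonzero(w))
```

Features with nonzero weight in `w` are the ones this sketch would select. In this assumed form, `lam` controls sparsity and `sigma` controls how strongly the class geometry influences the weights; the convexity check is a design choice of the sketch, since the difference of the two Laplacians is not guaranteed to be positive semidefinite.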

CLC Number: