Journal of Computer Applications, 2010, Vol. 30, Issue (4): 993-996.

• Artificial Intelligence •

Feature selection method based on improved F-score and support vector machine

Juanying Xie1, Chunxia Wang1, Shuai Jiang2, Yan Zhang1

  1. Shaanxi Normal University
    2.
  • Received: 2009-10-12 Revised: 2009-12-03 Online: 2010-04-15 Published: 2010-04-01
  • Corresponding author: Juanying Xie

Feature selection method combining improved F-score and support vector machine

  • Received: 2009-10-12 Revised: 2009-12-03 Online: 2010-04-15 Published: 2010-04-01
  • Contact: Juanying Xie

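Abstract: The traditional F-score, which measures how well a feature discriminates between two classes, is generalized into an improved F-score that evaluates a feature's discriminative power not only between two classes but also among multiple classes. The improved F-score serves as the feature selection criterion, and a support vector machine (SVM) evaluates the effectiveness of each selected feature subset, giving an effective feature selection procedure. Experiments on six datasets from the UCI machine learning repository, with comparisons against SVM and PCA+SVM, show that the feature selection method based on the improved F-score and SVM not only improves classification accuracy but also generalizes well, and requires less training time than PCA+SVM.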

Key words: F-score, support vector machine, feature selection, principal component analysis, kernel principal component analysis

Abstract: The original F-score can only measure the discrimination between two sets of real numbers. This paper proposes an improved F-score that can measure the discrimination not only between two sets of real numbers but also among more than two sets. The improved F-score and Support Vector Machines (SVM) are combined to carry out feature selection: the improved F-score is the evaluation criterion for selecting features, and SVM evaluates the features selected via the improved F-score. Experiments were conducted on six datasets from the UCI machine learning repository. The results show that the feature selection method based on the improved F-score and SVM achieves high classification accuracy and good generalization, and requires less training time than the Principal Component Analysis (PCA)+SVM method.

Key words: F-score, Support Vector Machine (SVM), feature selection, Principal Component Analysis (PCA), Kernel Principal Component Analysis (KPCA)
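
The abstract does not state the improved F-score formula, so the following is only a minimal sketch under an assumption: that the multi-class score of a feature is the sum over classes of the squared distance between the class mean and the overall mean, divided by the sum over classes of the unbiased within-class variance (the natural extension of the two-class F-score). The function names `improved_f_score` and `rank_features` and the toy data are hypothetical, and the SVM step that evaluates each candidate subset is left out.

```python
from statistics import mean, variance


def improved_f_score(values, labels):
    """Multi-class F-score of one feature (assumed form, not taken from the paper).

    Numerator: sum over classes of (class mean - overall mean)^2.
    Denominator: sum over classes of the unbiased within-class variance.
    """
    overall = mean(values)
    by_class = {}
    for v, y in zip(values, labels):
        by_class.setdefault(y, []).append(v)
    between = sum((mean(vs) - overall) ** 2 for vs in by_class.values())
    # statistics.variance divides by (n - 1), so every class needs >= 2 samples.
    within = sum(variance(vs) for vs in by_class.values())
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return between / within


def rank_features(rows, labels):
    """Return (feature indices by descending score, per-feature scores)."""
    columns = list(zip(*rows))
    scores = [improved_f_score(col, labels) for col in columns]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order, scores


# Toy data, made up for illustration: feature 0 separates the three classes,
# feature 1 barely does.
rows = [
    (1.0, 5.0), (1.2, 3.0),
    (3.0, 4.0), (3.2, 6.0),
    (5.0, 3.5), (5.2, 5.5),
]
labels = ["a", "a", "b", "b", "c", "c"]
order, scores = rank_features(rows, labels)
print(order, scores)
```

In a full pipeline along the lines the abstract describes, features would be added in this ranked order and each candidate subset scored by an SVM classifier, keeping the subset with the best validation accuracy.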