Journal of Computer Applications ›› 2010, Vol. 30 ›› Issue (06): 1530-1532.

• Artificial Intelligence •

Feature selection algorithm based on Hellinger distance

Wei-Wei LI (李伟湋)1, Xiu-Yi JIA (贾修一)2

  1. Nanjing University of Aeronautics and Astronautics
    2.
  • Received: 2010-01-04 Revised: 2010-03-05 Online: 2010-06-01 Published: 2010-06-01
  • Contact: Wei-Wei LI
  • Supported by:
    Research on outlier detection techniques based on granular computing

Abstract: To address the feature selection problem in data mining, two definitions of the Hellinger distance are studied on the basis of its properties, a feature selection method based on the Hellinger distance is proposed, and two corresponding algorithms are designed. Experimental results on different data sets show that the features selected by the new algorithms are effective. Compared with other feature selection algorithms, the two algorithms select fewer features, and these features work well with C4.5, improving average classification accuracy on the data sets studied.

Key words: feature selection, Hellinger distance, data mining
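
The abstract does not spell out the two Hellinger distance definitions or the two algorithms, so the following is only a minimal sketch of the general idea, under stated assumptions. It uses the standard Hellinger distance between two discrete distributions, H(P, Q) = sqrt(0.5 * sum_i (sqrt(p_i) - sqrt(q_i))^2), and scores each discrete feature by the distance between its value distributions in two classes. The helper names (`hellinger`, `value_distribution`, `rank_features`) and the two-class restriction are illustrative assumptions, not the authors' method.

```python
import math
from collections import Counter


def hellinger(p, q):
    """Standard Hellinger distance between two discrete distributions.

    p and q are dicts mapping a value to its probability.
    The result lies in [0, 1]: 0 for identical distributions,
    1 for distributions with disjoint support.
    """
    keys = set(p) | set(q)
    total = sum(
        (math.sqrt(p.get(k, 0.0)) - math.sqrt(q.get(k, 0.0))) ** 2 for k in keys
    )
    return math.sqrt(total / 2.0)


def value_distribution(values):
    """Empirical distribution of a non-empty list of discrete values."""
    counts = Counter(values)
    n = len(values)
    return {v: c / n for v, c in counts.items()}


def rank_features(rows, labels):
    """Hypothetical ranking helper (an assumption, not from the paper).

    Scores each feature column by the Hellinger distance between its
    value distributions in the two classes, and returns (score, index)
    pairs sorted from most to least discriminative.
    """
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch assumes exactly two classes")
    first, second = classes
    scores = []
    for j in range(len(rows[0])):
        col_first = [r[j] for r, y in zip(rows, labels) if y == first]
        col_second = [r[j] for r, y in zip(rows, labels) if y == second]
        score = hellinger(
            value_distribution(col_first), value_distribution(col_second)
        )
        scores.append((score, j))
    return sorted(scores, key=lambda s: (-s[0], s[1]))


# Toy data: feature 0 separates the classes perfectly, feature 1 not at all.
rows = [("x", "u"), ("x", "v"), ("y", "u"), ("y", "v")]
labels = [0, 0, 1, 1]
ranking = rank_features(rows, labels)
```

In this toy data, feature 0 takes disjoint values in the two classes and so ranks first, while feature 1 has the same distribution in both classes and scores zero. A selection step would then keep the top-ranked features before training a classifier such as C4.5.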