Journal of Computer Applications ›› 2011, Vol. 31 ›› Issue (05): 1318-1320. DOI: 10.3724/SP.J.1087.2011.01318

• Artificial Intelligence •

基于特征选择的多侧面覆盖算法

吴涛1,2,张方方2   

  1. 安徽大学 计算智能与信号处理教育部重点实验室, 合肥 230039
  2. 安徽大学 数学科学学院, 合肥 230039
  • 收稿日期:2010-11-17 修回日期:2011-01-05 发布日期:2011-05-01 出版日期:2011-05-01
  • 通讯作者: 张方方
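  • About the authors: WU Tao (1970-), male, born in Taihe, Anhui; professor, Ph.D.; main research interests: machine learning and intelligent computing. ZHANG Fang-fang (1986-), female, born in Mengcheng, Anhui; master's student; main research interests: machine learning and intelligent computing.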
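  • Supported by:

    National Natural Science Foundation of China (60675031); National 973 Program of China (2007BC311003); Provincial Natural Science Research Project of Anhui Higher Education Institutions (KJ2008B093); Anhui University Innovation Team (KJTD001B); Anhui University Talent Team Construction Fund.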

Multi-side covering algorithm based on feature selection

WU Tao1,2, ZHANG Fang-fang2   

  1. Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Anhui University, Hefei, Anhui 230039, China
  2. School of Mathematical Sciences, Anhui University, Hefei, Anhui 230039, China
  • Received: 2010-11-17 Revised: 2011-01-05 Online: 2011-05-01 Published: 2011-05-01
  • Contact: ZHANG Fang-fang

摘要: 多侧面覆盖算法对海量高维数据的分类采用分而治之的思想,依据分量差的绝对值和,选取部分属性构建不同样本子集的覆盖,降低了学习的复杂度,但初始属性集的选择依据经验或实验获得。为降低初始属性集选择的主观性和属性集调整的复杂性,利用Relief特征选择方法确定适合不同数据集的最优特征子集,构建了分层递阶的覆盖网络,并对实际数据集进行实验。实验结果表明,该算法具有较高的精度和效率,可以有效地实现复杂问题的分类。

关键词: 覆盖算法, 特征选择, 多侧面递进

Abstract: The multi-side covering algorithm applies a divide-and-conquer strategy to the classification of massive high-dimensional data. Based on the sum of the absolute values of the component differences, it selects subsets of attributes to construct coverings for different subsets of the training samples, which reduces the complexity of learning. However, the initial attribute set has so far been chosen by experience or by experiment. To reduce the subjectivity of selecting the initial attribute set and the complexity of adjusting it, the Relief feature selection method was used to determine the optimal feature subset for each data set, a hierarchical covering network was constructed, and experiments were carried out on real data sets. The experimental results show that the algorithm achieves high accuracy and efficiency and can effectively classify complex problems.

Key words: covering algorithm, feature selection, multi-side progression
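
The abstract names Relief as the method used to choose the feature subset but does not give the variant, the normalization, or the selection threshold used in the paper. As an illustration only, the following is a minimal sketch of the basic two-class Relief weighting: for each sampled instance, a feature gains weight when it differs on the nearest sample of the other class (the "miss") and loses weight when it differs on the nearest sample of the same class (the "hit"). The names `relief`, `select_features`, `n_iter`, and `seed`, the range scaling, and the Manhattan distance are all assumptions made for this sketch, not details taken from the paper.

```python
import random


def relief(X, y, n_iter=None, seed=0):
    """Return one Relief weight per feature for a two-class data set.

    X is a list of equal-length numeric feature vectors, y the class labels.
    Hypothetical helper for illustration; not the paper's implementation.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])

    # Scale each feature difference by the feature's range (assumed choice)
    # so that weights of differently scaled features are comparable.
    lo = [min(row[j] for row in X) for j in range(d)]
    hi = [max(row[j] for row in X) for j in range(d)]
    span = [(hi[j] - lo[j]) or 1.0 for j in range(d)]

    def diff(a, b, j):
        return abs(a[j] - b[j]) / span[j]

    def dist(a, b):
        # Manhattan distance over the scaled features (assumed choice).
        return sum(diff(a, b, j) for j in range(d))

    m = n if n_iter is None else n_iter
    w = [0.0] * d
    for _ in range(m):
        i = rng.randrange(n)
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue  # no same-class or other-class neighbour available
        hit = min(hits, key=lambda k: dist(X[i], X[k]))
        miss = min(misses, key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            # Reward separation from the nearest miss, penalize
            # separation from the nearest hit.
            w[j] += (diff(X[i], X[miss], j) - diff(X[i], X[hit], j)) / m
    return w


def select_features(weights, k):
    """Indices of the k highest-weighted features, best first."""
    order = sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)
    return order[:k]


if __name__ == "__main__":
    # Toy data: feature 0 separates the classes, feature 1 is noise.
    X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1],
         [0.9, 0.2], [1.0, 0.8], [0.8, 0.5]]
    y = [0, 0, 0, 1, 1, 1]
    weights = relief(X, y)
    print(weights, select_features(weights, 1))
```

In a pipeline like the one the abstract describes, the indices returned by `select_features` would presumably serve as the initial attribute set handed to the covering construction. How the paper sets the number of retained features is not stated here, so `k` is left as a free parameter.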