Journal of Computer Applications ›› 2016, Vol. 36 ›› Issue (11): 2958-2962. DOI: 10.11772/j.issn.1001-9081.2016.11.2958

• Papers of the 16th China Joint Conference on Rough Sets and Soft Computing (CRSSC 2016) •

Attribute reduction in incomplete information systems based on extended tolerance relation

罗豪, 续欣莹, 谢珺, 张扩, 谢新林   

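  1. College of Information Engineering, Taiyuan University of Technology, Taiyuan 030600
  • Received: 2016-06-07 Revised: 2016-06-20 Online: 2016-11-10 Published: 2016-11-12
  • Corresponding author: XU Xinying
  • About the authors: LUO Hao (1990-), male, born in Zhoukou, Henan, M.S. candidate; research interests: machine learning, data mining, intelligent information processing. XU Xinying (1979-), male, born in Dingxiang, Shanxi, associate professor, Ph.D., CCF member; research interests: granular computing, big data analysis, machine learning. XIE Jun (1979-), female, born in Wutai, Shanxi, associate professor, Ph.D.; research interests: granular computing, rough sets, data mining, machine learning. ZHANG Kuo (1991-), male, born in Chaoyang, Liaoning, M.S. candidate; research interests: machine learning, data mining, intelligent information processing. XIE Xinlin (1990-), male, born in Jiangxian, Shanxi, Ph.D. candidate, CCF member; research interests: computer vision, granular computing, evolutionary computation.
  • Supported by:
    Natural Science Foundation of Shanxi Province (2014011018-2); Research Project for Returned Overseas Scholars of Shanxi Province (2013-033, 2015-45).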

Attribute reduction in incomplete information systems based on extended tolerance relation

LUO Hao, XU Xinying, XIE Jun, ZHANG Kuo, XIE Xinlin   

  1. College of Information Engineering, Taiyuan University of Technology, Taiyuan Shanxi 030600, China
  • Received: 2016-06-07 Revised: 2016-06-20 Online: 2016-11-10 Published: 2016-11-12
  • Supported by:
    This work is partially supported by the Natural Science Foundation of Shanxi Province (2014011018-2) and the Shanxi Province Research Project for Returned Overseas Scholars (2013-033, 2015-045).

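Abstract: Current neighborhood rough sets are mostly used to handle complete information systems rather than incomplete ones. To address this problem, an extended tolerance relation that can handle incomplete mixed information systems is proposed, and the related definitions are given. The tolerance completeness degree and the neighborhood threshold are used as constraints to compute the extended tolerance neighborhood. The decision positive region is selected on the basis of this neighborhood to obtain the attribute significance of the system, and an attribute reduction algorithm based on the extended tolerance relation is given with this significance as the heuristic factor. Simulation experiments are carried out on seven data sets of different types from the UCI repository, and the method is compared with the Extension Neighborhood relation (EN), Tolerance Neighborhood Entropy (TRE) and Neighborhood Rough set (NR) methods. The experimental results show that the method obtains fewer attributes after reduction while maintaining classification accuracy. Finally, the effect of changing the neighborhood threshold in the extended tolerance relation on classification accuracy is discussed.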

Key words: neighborhood rough set, incomplete information, attribute reduction, attribute significance, neighborhood threshold

Abstract: Existing neighborhood rough set models are mostly applied to complete information systems rather than incomplete ones. To address this problem, an extended tolerance relation was proposed for incomplete mixed information systems, and the related definitions were provided. The tolerance completeness degree and the neighborhood threshold were used as constraints to compute the extended tolerance neighborhood. The attribute significance of the system was obtained from the decision positive region determined by this neighborhood, and an attribute reduction algorithm based on the extended tolerance relation was proposed with this significance as the heuristic factor. Seven data sets of different types from the UCI repository were used for simulation, and the proposed method was compared with the Extension Neighborhood relation (EN), Tolerance Neighborhood Entropy (TRE) and Neighborhood Rough set (NR) methods. The experimental results show that the proposed algorithm selects fewer attributes through reduction while maintaining classification accuracy. Finally, the influence of the neighborhood threshold in the extended tolerance relation on classification accuracy was discussed.

Key words: Neighborhood Rough set (NR), incomplete information, attribute reduction, attribute significance, neighborhood threshold
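
The abstract describes the reduction procedure only in outline, so the following Python sketch is an illustration under stated assumptions rather than the authors' algorithm. It assumes that missing values are marked with `None`, that a missing value is compatible with any value, that the tolerance completeness degree is the fraction of compared attributes on which both objects are known, and that numeric attributes are already scaled to [0, 1] and compared by Euclidean distance against the neighborhood threshold `delta`. All function and parameter names (`tolerant`, `positive_region`, `reduce_attributes`, `min_known`) are hypothetical.

```python
import math

MISSING = None  # assumed marker for a missing attribute value


def tolerant(x, y, attrs, delta, min_known):
    """Assumed tolerance test: True if y lies in the neighborhood of x on attrs."""
    known = 0
    squared = 0.0
    for a in attrs:
        if x[a] is MISSING or y[a] is MISSING:
            continue  # assumption: a missing value matches anything
        known += 1
        if isinstance(x[a], str) or isinstance(y[a], str):
            if x[a] != y[a]:
                return False  # categorical values must be equal
        else:
            squared += (x[a] - y[a]) ** 2
    # assumed completeness constraint: enough attributes known on both objects
    if attrs and known / len(attrs) < min_known:
        return False
    return math.sqrt(squared) <= delta


def positive_region(data, labels, attrs, delta, min_known):
    """Indices of objects whose whole neighborhood shares their decision label."""
    pos = set()
    for i, x in enumerate(data):
        if all(labels[j] == labels[i]
               for j, y in enumerate(data)
               if i == j or tolerant(x, y, attrs, delta, min_known)):
            pos.add(i)
    return pos


def reduce_attributes(data, labels, delta=0.1, min_known=0.5):
    """Greedy forward selection driven by growth of the positive region."""
    all_attrs = list(range(len(data[0])))
    target = len(positive_region(data, labels, all_attrs, delta, min_known))
    reduct, current = [], 0
    while current < target:
        best, best_size = None, current
        for a in all_attrs:
            if a in reduct:
                continue
            size = len(positive_region(data, labels, reduct + [a],
                                       delta, min_known))
            if size > best_size:  # significance = gain in positive region size
                best, best_size = a, size
        if best is None:
            break  # no single attribute improves the positive region
        reduct.append(best)
        current = best_size
    return reduct


# Toy mixed table with missing values (made up for illustration only).
data = [
    [0.10, "a", 0.9],
    [0.12, "a", 0.1],
    [0.80, "b", None],
    [0.85, None, 0.2],
]
labels = [0, 0, 1, 1]
reduct = reduce_attributes(data, labels, delta=0.2, min_known=0.5)
print(reduct)  # [0]
```

On this toy table attribute 0 alone already separates the two classes, so the greedy search stops after one step. Changing `delta` changes which objects count as neighbors, which is the kind of threshold sensitivity the abstract says the paper examines.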

CLC Number: