Journal of Computer Applications ›› 2015, Vol. 35 ›› Issue (11): 3116-3121. DOI: 10.11772/j.issn.1001-9081.2015.11.3116

• DPCS 2015 Paper

New classification method based on neighborhood relation fuzzy rough set

HU Xuewei, JIANG Yun, LI Zhilei, SHEN Jian, HUA Fengliang   

  1. College of Computer Science and Engineering, Northwest Normal University, Lanzhou Gansu 730070, China
  • Received: 2015-06-17 Revised: 2015-07-08 Published: 2015-11-13


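  • Corresponding author: HU Xuewei (1991-), male, from Xianyang, Shaanxi, M.S. candidate. Research interests: data mining, rough sets.
  • About the authors: JIANG Yun (1970-), female, from Shaoxing, Zhejiang, professor, Ph.D., CCF member. Research interests: data mining, rough sets; LI Zhilei (1991-), female, from Hengshui, Hebei. Research interests: data mining, rough sets; SHEN Jian (1990-), male, from Hefei, Anhui, M.S. candidate. Research interests: data mining, rough sets; HUA Fengliang (1989-), female, from Xianyang, Shaanxi, M.S. candidate. Research interests: distributed and parallel computing.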

Abstract: Fuzzy rough sets induced by fuzzy equivalence relations cannot accurately reflect decision problems described by numerical attributes in the domain of fuzzy concepts. To address this, a fuzzy rough set model based on the neighborhood relation, called NR-FRS, was proposed. First, the definitions of the model were given. Then, building on the properties of NR-FRS, reasoning was carried out on the fuzzy neighborhood approximation space, and attribute dependency in feature subspaces was analyzed. Finally, a feature selection algorithm based on NR-FRS was presented; it constructs a feature subset whose fuzzy positive region gain exceeds a specified threshold, thereby removing redundant features and retaining attributes with strong classification capability. Classification experiments were carried out on UCI standard data sets, with a Radial Basis Function (RBF) support vector machine as the classifier. The experimental results show that, compared with fast forward feature selection based on the neighborhood rough set and with Kernel Principal Component Analysis (KPCA), the number of features in the subset obtained by the NR-FRS feature selection algorithm changes more smoothly and stably as the parameters vary. Meanwhile, the average classification accuracy improves by up to 5.2% and also varies more stably with the feature selection parameters.

Key words: granulation and approximation, feature selection, neighborhood relation, attribute dependency
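
The greedy selection step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' algorithm: the abstract does not give the NR-FRS definitions, so the `dependency` function below is a hypothetical stand-in (the mean share of same-label samples in each sample's neighborhood), and the names `forward_select`, `delta`, and `epsilon` are assumptions introduced for this example. Only the overall shape, adding a feature while the gain in dependency exceeds a threshold, follows the abstract.

```python
from math import sqrt


def dependency(X, y, features, delta):
    """Hypothetical stand-in for attribute dependency (NOT the NR-FRS definition).

    For each sample, take its delta-neighborhood in the subspace spanned by
    `features` and measure the share of neighbors with the same label.
    The dependency is the mean of these shares, a value in [0, 1].
    """
    if not features:
        return 0.0  # assumption: the empty feature set carries no dependency
    n = len(X)
    total = 0.0
    for i in range(n):
        same = 0
        count = 0
        for j in range(n):
            # Euclidean distance restricted to the candidate features
            dist = sqrt(sum((X[i][f] - X[j][f]) ** 2 for f in features))
            if dist <= delta:
                count += 1
                if y[i] == y[j]:
                    same += 1
        total += same / count  # count >= 1 because each sample neighbors itself
    return total / n


def forward_select(X, y, delta=0.2, epsilon=0.01):
    """Greedy forward selection: add the feature with the largest dependency
    gain, and stop once the best gain no longer exceeds `epsilon`."""
    remaining = set(range(len(X[0])))
    selected = []
    current = 0.0
    while remaining:
        scores = {f: dependency(X, y, selected + [f], delta) for f in remaining}
        # sorted() makes ties resolve to the lowest feature index
        best = max(sorted(scores), key=scores.get)
        if scores[best] - current <= epsilon:
            break  # remaining features are treated as redundant
        selected.append(best)
        current = scores[best]
        remaining.remove(best)
    return selected, current


# Toy data (made up for illustration): feature 0 separates the two classes,
# feature 1 is constant and therefore redundant.
X = [[0.0, 0.5], [0.1, 0.5], [0.9, 0.5], [1.0, 0.5]]
y = [0, 0, 1, 1]
selected, score = forward_select(X, y)
```

On this toy data the sketch keeps feature 0 and discards the constant feature 1, since adding it yields no gain. In the setting the abstract describes, the selected columns would then be passed to an RBF support vector machine for classification.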

