Journal of Computer Applications ›› 2019, Vol. 39 ›› Issue (8): 2288-2296. DOI: 10.11772/j.issn.1001-9081.2018122518

• Data science and technology •

Incremental attribute reduction algorithm of positive region in interval-valued decision tables

BAO Di1,2, ZHANG Nan1,2, TONG Xiangrong1,2, YUE Xiaodong3   

    1. Key Lab for Data Science and Intelligence Technology of Shandong Higher Education Institutes (Yantai University), Yantai, Shandong 264005, China;
    2. School of Computer and Control Engineering, Yantai University, Yantai, Shandong 264005, China;
    3. School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China
  • Received: 2018-12-21 Revised: 2019-04-01 Online: 2019-08-10 Published: 2019-04-17
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61403329, 61572418, 61702439, 61572419, 61502410) and the Shandong Provincial Natural Science Foundation (ZR2016FM42, ZR2018BA004).

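  • Corresponding author: ZHANG Nan (张楠)
  • About the authors: BAO Di (鲍迪, born 1994), female, from Zhucheng, Shandong, M.S. candidate; research interests: rough sets, data mining, machine learning. ZHANG Nan (张楠, born 1979), male, from Yantai, Shandong, lecturer, Ph.D., CCF member; research interests: rough sets, cognitive informatics, artificial intelligence. TONG Xiangrong (童向荣, born 1975), male, from Yantai, Shandong, professor, Ph.D.; research interests: multi-agent systems, data mining, distributed artificial intelligence. YUE Xiaodong (岳晓冬, born 1981), male, from Taiyuan, Shanxi, associate professor, Ph.D.; research interests: machine learning, soft computing, data mining.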

Abstract: Practical applications generate large amounts of dynamically increasing interval-valued data. If a classic non-incremental positive region attribute reduction method is applied, the positive region reduct of the updated interval-valued dataset must be recomputed from scratch, which greatly lowers the computational efficiency of attribute reduction. To solve this problem, incremental positive region attribute reduction methods for interval-valued decision tables were proposed. Firstly, the concepts related to positive region reduction in interval-valued decision tables were defined. Then, the single and group incremental update mechanisms of the positive region were discussed and proved, and single and group incremental positive region attribute reduction algorithms for interval-valued decision tables were proposed. Finally, experiments were carried out on 8 UCI datasets. When the data size of the 8 datasets increased from 60% to 100%, the reduction times of the classic non-incremental attribute reduction algorithm on the 8 datasets were 36.59 s, 72.35 s, 69.83 s, 154.29 s, 80.66 s, 1498.11 s, 4124.14 s and 809.65 s, respectively; those of the single incremental attribute reduction algorithm were 19.05 s, 46.54 s, 26.98 s, 26.12 s, 34.02 s, 1270.87 s, 1598.78 s and 408.65 s; and those of the group incremental attribute reduction algorithm were 6.39 s, 15.66 s, 3.44 s, 15.06 s, 8.02 s, 167.12 s, 180.88 s and 61.04 s. Both incremental algorithms were faster than the non-incremental algorithm on all 8 datasets, and the group incremental algorithm was the fastest on each, which shows that the proposed incremental positive region attribute reduction algorithms for interval-valued decision tables are efficient.
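
The idea of updating a positive region instead of recomputing it can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the definitions, so the sketch assumes an overlap-over-span similarity between intervals, a threshold `theta` for the tolerance relation, and a positive region made of the objects whose tolerance class carries a single decision value. The names `similarity`, `tolerant`, `positive_region` and `add_object` are hypothetical.

```python
from typing import List, Set, Tuple

Interval = Tuple[float, float]
Row = Tuple[List[Interval], str]  # (condition intervals, decision label)


def similarity(u: Interval, v: Interval) -> float:
    """Assumed measure: overlap length divided by the length of the joint span."""
    overlap = min(u[1], v[1]) - max(u[0], v[0])
    span = max(u[1], v[1]) - min(u[0], v[0])
    if span == 0:  # both intervals are the same single point
        return 1.0
    return max(overlap, 0.0) / span


def tolerant(x: Row, y: Row, theta: float) -> bool:
    """x and y are tolerant if every attribute is at least theta-similar."""
    return all(similarity(a, b) >= theta for a, b in zip(x[0], y[0]))


def positive_region(table: List[Row], theta: float) -> Set[int]:
    """Non-incremental version: compare every object with every other object."""
    pos = set()
    for i, x in enumerate(table):
        if all(y[1] == x[1] for y in table if tolerant(x, y, theta)):
            pos.add(i)
    return pos


def add_object(table: List[Row], pos: Set[int], new_row: Row, theta: float) -> Set[int]:
    """Single-object update: only objects tolerant with new_row can change status."""
    updated = set(pos)
    consistent = True
    for i, y in enumerate(table):
        if tolerant(new_row, y, theta) and y[1] != new_row[1]:
            updated.discard(i)   # y now has a conflicting neighbour
            consistent = False   # and so does the new object
    table.append(new_row)
    if consistent:
        updated.add(len(table) - 1)
    return updated


table: List[Row] = [
    ([(1.0, 2.0), (0.0, 1.0)], "yes"),
    ([(1.1, 2.1), (0.0, 1.0)], "yes"),
    ([(5.0, 6.0), (3.0, 4.0)], "no"),
]
pos = positive_region(table, theta=0.7)
pos = add_object(table, pos, ([(1.05, 2.05), (0.0, 1.0)], "no"), theta=0.7)
# The incremental result agrees with a full recomputation on the enlarged table.
assert pos == positive_region(table, theta=0.7)
```

Under these assumptions the update touches each existing object once per arriving object, whereas the full recomputation compares all pairs, which is the kind of saving the reported reduction times reflect.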

Key words: rough set, interval-valued decision table, tolerance relation, positive region, incremental attribute reduction

