Journal of Computer Applications ›› 2019, Vol. 39 ›› Issue (11): 3120-3126. DOI: 10.11772/j.issn.1001-9081.2019050864

• Papers of the 2019 China Conference on Granular Computing and Knowledge Discovery (CGCKD2019) •

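  • Corresponding author: XU Jianfeng
  • About the authors: XU Jianfeng (born 1973), male, from Nanchang, Jiangxi, professor, Ph.D., CCF member, research interests: granular computing, rough sets, three-way decision, deep learning, ensemble clustering; ZOU Weikang (born 1995), male, from Ji'an, Jiangxi, M.S. candidate, research interests: granular computing, rough sets, three-way decision, ensemble clustering; LIANG Wei (born 1993), male, from Lianyungang, Jiangsu, M.S. candidate, research interests: machine learning, granular computing, rough sets, three-way decision, deep learning, ensemble clustering; CHENG Gaojie (born 1985), female, from Linchuan, Jiangxi, lecturer, M.S., research interests: granular computing, rough sets, machine learning, ensemble clustering; ZHANG Yuanjian (born 1990), male, from Yangzhou, Jiangsu, Ph.D. candidate, CCF member, research interests: granular computing, three-way decision, multi-label learning.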

Three-way screening method of basic clustering for ensemble clustering

XU Jianfeng1,2,3, ZOU Weikang1, LIANG Wei2, CHENG Gaojie2, ZHANG Yuanjian3   

  1. School of Information Engineering, Nanchang University, Nanchang, Jiangxi 330031, China;
  2. School of Software, Nanchang University, Nanchang, Jiangxi 330047, China;
  3. College of Electronics and Information Engineering, Tongji University, Shanghai 201804, China
  • Received: 2019-05-06 Revised: 2019-06-10 Published: 2019-11-10 Released online: 2019-09-11
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61763031, 61673301) and the National Key Research and Development Program of China (213).

Abstract: Current research on ensemble clustering focuses mainly on optimizing the ensemble strategy, while measuring and optimizing the quality of the basic clusterings has received little attention. Based on information entropy theory, a quality measurement index for basic clusterings is proposed, and a three-way screening method for basic clusterings is constructed by combining it with the idea of three-way decision. Firstly, the thresholds α and β of the three-way decision for basic clustering screening are preset. Secondly, the average cluster quality of each basic clustering is calculated and used as the quality measurement index of that basic clustering. Finally, the three-way decision is applied. The decision strategy is: 1) delete a basic clustering if its quality measurement index is less than the threshold β; 2) keep a basic clustering if its quality measurement index is greater than or equal to the threshold α; 3) recalculate the quality of a basic clustering if its quality measurement index is greater than or equal to β and less than α, and apply the three-way decision again, repeating until no basic clustering is deleted or the specified number of iterations is reached. Comparative experiments show that the three-way screening method for basic clusterings can effectively improve ensemble clustering results.
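
The screening loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `three_way_screen`, `rand_agreement` and `mean_agreement` are hypothetical, and `mean_agreement` (average Rand-style pairwise agreement with the other basic clusterings) is only a stand-in for the paper's entropy-based quality index, whose formula is not given in the abstract. The sketch also assumes that quality is measured against the currently surviving pool, that kept clusterings are not re-evaluated, and that clusterings still in the boundary region when the loop stops are retained.

```python
from itertools import combinations
from typing import Callable, List, Sequence

Clustering = Sequence[int]  # one cluster label per sample


def rand_agreement(a: Clustering, b: Clustering) -> float:
    """Fraction of sample pairs on which two clusterings agree (same/different cluster)."""
    pairs = list(combinations(range(len(a)), 2))
    same = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return same / len(pairs)


def mean_agreement(c: Clustering, pool: List[Clustering]) -> float:
    """Stand-in quality index (assumption): mean agreement with the rest of the pool."""
    others = [o for o in pool if o is not c]
    if not others:
        return 1.0
    return sum(rand_agreement(c, o) for o in others) / len(others)


def three_way_screen(
    clusterings: List[Clustering],
    quality: Callable[[Clustering, List[Clustering]], float],
    alpha: float,
    beta: float,
    max_iter: int = 10,
) -> List[Clustering]:
    """Keep (q >= alpha), delete (q < beta), or defer (beta <= q < alpha) each clustering."""
    if beta > alpha:
        raise ValueError("expected beta <= alpha")
    accepted: List[Clustering] = []               # positive region: kept
    pending: List[Clustering] = list(clusterings)  # not yet decided
    for _ in range(max_iter):
        pool = accepted + pending  # quality is measured against the surviving pool
        newly_accepted, still_pending, deleted = [], [], 0
        for c in pending:
            q = quality(c, pool)
            if q >= alpha:
                newly_accepted.append(c)
            elif q < beta:
                deleted += 1  # negative region: dropped from the ensemble
            else:
                still_pending.append(c)  # boundary region: re-evaluated next round
        accepted += newly_accepted
        pending = still_pending
        if deleted == 0:  # stop once a round deletes nothing
            break
    # Assumption: clusterings still in the boundary region are retained.
    return accepted + pending
```

Passing the quality index as a callable keeps the three-way decision logic separate from the measure itself, so an entropy-based index could be substituted for the stand-in without changing the loop.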

Key words: three-way decision, ensemble clustering, basic clustering, three-way screening

CLC number: