Journal of Computer Applications (计算机应用) ›› 2005, Vol. 25 ›› Issue (02): 348-351. DOI: 10.3724/SP.J.1087.2005.0348

• Software Technology •

Research on the weighting exponent in fuzzy K-Prototypes algorithm

WANG Jia-cai (汪加才)1, ZHU Yi-hua (朱艺华)2

  1. Department of Computer Science and Technology, Nanjing Audit University, Nanjing, Jiangsu 210029, China; 2. Institute of Information Intelligence and Decision Optimization, Zhejiang University of Technology, Hangzhou, Zhejiang 310014, China
  • Online: 2005-02-01 Published: 2005-02-01
  • Supported by:

    National Natural Science Foundation of China (60473097); Natural Science Research Program of Jiangsu Higher Education Institutions (03KJB520054)

Abstract: The fuzzy K-Prototypes (FKP) algorithm combines the ways K-Means and K-Modes handle numeric and categorical data, which makes it suited to clustering mixed-type data. The use of fuzzy techniques also makes FKP suitable for databases that contain noise and missing values. However, how to select an appropriate weighting exponent α when running FCM (the Fuzzy C-Means algorithm) or FKP remains an open problem. Based on their experimental results, a number of researchers have suggested that the best choice of α in FCM probably lies in the interval [1.5, 2.5]. This paper presents an algorithm for searching for a suitable α in FKP. Experimental results on several real datasets show that, for clustering to be valid, the weighting exponent in FKP should be less than 1.5.

Key words: weighting exponent, FKP algorithm, clustering validity
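
The role of the weighting exponent can be illustrated with a minimal sketch. It assumes the standard FCM membership formula and a K-Prototypes-style dissimilarity (squared Euclidean distance on numeric attributes plus a weight `gamma` times the number of categorical mismatches). The function names, the `gamma` parameter, and the sample data are hypothetical. This is not the search algorithm proposed in the paper, which the abstract does not describe.

```python
def mixed_distance(x_num, x_cat, proto_num, proto_cat, gamma=1.0):
    """Squared Euclidean distance on numeric attributes plus gamma times
    the number of mismatched categorical attributes (an assumed form)."""
    numeric = sum((a - b) ** 2 for a, b in zip(x_num, proto_num))
    categorical = sum(1 for a, b in zip(x_cat, proto_cat) if a != b)
    return numeric + gamma * categorical


def fuzzy_memberships(distances, alpha):
    """Memberships of one object in every cluster, given its distances to
    all prototypes: u_i = 1 / sum_k (d_i / d_k) ** (1 / (alpha - 1))."""
    if alpha <= 1.0:
        raise ValueError("alpha must be greater than 1")
    # An object that coincides with some prototypes is shared among them.
    zeros = [i for i, d in enumerate(distances) if d == 0.0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0
                for i in range(len(distances))]
    power = 1.0 / (alpha - 1.0)
    weights = [(1.0 / d) ** power for d in distances]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # One object and two prototypes, each with two numeric and two
    # categorical attributes (made-up illustrative values).
    obj = ([1.0, 2.0], ["a", "x"])
    prototypes = [([1.0, 2.5], ["a", "x"]), ([3.0, 2.0], ["b", "x"])]
    dists = [mixed_distance(obj[0], obj[1], p[0], p[1]) for p in prototypes]
    for alpha in (1.2, 1.5, 2.0, 2.5):
        print(alpha, fuzzy_memberships(dists, alpha))
```

In this sketch a smaller α pushes the memberships toward a crisp (0 or 1) assignment, while a larger α spreads them more evenly across clusters. That is why the choice of α affects whether the resulting clustering is meaningful.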
