Clustering algorithm with maximum distance between clusters based on improved kernel fuzzy C-means

Journal of Computer Applications, 2016, Vol. 36, Issue (7): 1981-1987. DOI: 10.11772/j.issn.1001-9081.2016.07.1981

LI Bin, DI Lan, WANG Shaohua, YU Xiaotong

School of Digital Media, Jiangnan University, Wuxi, Jiangsu 214122, China

Received: 2015-12-08 Revised: 2016-03-20 Online: 2016-07-14 Published: 2016-07-10
Corresponding author: LI Bin
About the authors: LI Bin (1991-), male, born in Taizhou, Jiangsu, M.S. candidate; research interests: pattern recognition, data mining. DI Lan (1965-), female, born in Nanjing, Jiangsu, associate professor, M.S., CCF member; research interests: pattern recognition, digital image processing. WANG Shaohua (1991-), male, born in Jiujiang, Jiangxi, M.S. candidate; research interests: image processing, data mining. YU Xiaotong (1989-), male, born in Qingdao, Shandong, M.S. candidate; research interests: image processing, data mining.
Supported by:
This work is partially supported by the Six Talent Peaks Project in Jiangsu Province (DZXX-028) and the Industry-University-Research Project in Jiangsu Province (BY2014023-33).

Abstract: Traditional kernel clustering considers only the relationships among elements within a cluster and ignores the relationships between clusters, so boundary points are easily misclassified when the data set has fuzzy or noisy boundaries. To solve this problem, a clustering algorithm called Kernel Fuzzy C-Means with Maximum distance between clusters (MKFCM) was proposed on the basis of the Kernel Fuzzy C-Means (KFCM) clustering algorithm. Taking into account both within-cluster and between-cluster relationships, the algorithm introduces a between-cluster maximization penalty term in the high-dimensional feature space, together with a control factor. This enlarges the distance between cluster centers, so that samples near the boundaries are partitioned better. In experiments on simulated data sets, the offset distance of the cluster centers obtained by the proposed algorithm was clearly lower than that of the other algorithms compared. On the artificial Gaussian data sets, the Accuracy (ACC), Normalized Mutual Information (NMI) and Rand Index (RI) of the proposed algorithm were improved to 0.9132, 0.7575 and 0.9138, respectively. The proposed algorithm is therefore of theoretical interest for data sets with fuzzy and noisy boundaries.

Key words: kernel clustering, Fuzzy C-Means (FCM) clustering, maximum penalty term between centers, fuzzy boundary
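
For orientation, the baseline that the abstract builds on can be sketched as follows. This is a minimal NumPy sketch of kernel fuzzy C-means with a Gaussian kernel and centers kept in the input space, as it is commonly formulated; it is not the authors' MKFCM. The between-cluster penalty term and control factor are not specified on this page, so they are deliberately left out rather than guessed. The function names (`gaussian_kernel`, `memberships`, `kfcm`, `rand_index`), the parameter defaults and the toy data are all assumptions made for illustration. The Rand Index is computed by simple pair counting.

```python
import numpy as np


def gaussian_kernel(X, V, sigma):
    """Gaussian kernel K(x, v) = exp(-||x - v||^2 / sigma^2), shape (n, c)."""
    sq_dist = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / sigma ** 2)


def memberships(X, V, m, sigma):
    """Fuzzy memberships from the kernel-induced distance 2 * (1 - K).

    The constant factor 2 cancels when the rows are normalized.
    """
    dist = np.maximum(1.0 - gaussian_kernel(X, V, sigma), 1e-12)  # avoid division by zero
    inv = dist ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)  # each row sums to 1


def kfcm(X, init_centers, m=2.0, sigma=2.0, n_iter=100, tol=1e-6):
    """Baseline KFCM (assumed formulation, no between-cluster penalty)."""
    V = np.array(init_centers, dtype=float)
    for _ in range(n_iter):
        U = memberships(X, V, m, sigma)
        W = (U ** m) * gaussian_kernel(X, V, sigma)  # weights u_ik^m * K(x_k, v_i)
        V_new = (W.T @ X) / W.sum(axis=0)[:, None]   # kernel-weighted mean per cluster
        converged = np.abs(V_new - V).max() < tol
        V = V_new
        if converged:
            break
    return memberships(X, V, m, sigma), V


def rand_index(labels_a, labels_b):
    """Fraction of sample pairs on which two labelings agree (same/different cluster)."""
    a, b = np.asarray(labels_a), np.asarray(labels_b)
    same_a = a[:, None] == a[None, :]
    same_b = b[:, None] == b[None, :]
    upper = np.triu_indices(len(a), k=1)  # each unordered pair counted once
    return float((same_a[upper] == same_b[upper]).mean())


# Toy data (hypothetical): two well-separated Gaussian blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (50, 2)), rng.normal(5.0, 0.3, (50, 2))])
y_true = np.repeat([0, 1], 50)

# Start from one sample of each blob so the run is deterministic.
U, V = kfcm(X, init_centers=X[[0, -1]])
print("centers:\n", V)
print("Rand index:", rand_index(y_true, U.argmax(axis=1)))
```

On data with overlapping or noisy boundaries, the memberships of boundary samples in this baseline become ambiguous, which is the situation the abstract says the penalty term is meant to address.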

CLC Number:

TP391.4
TP18

LI Bin, DI Lan, WANG Shaohua, YU Xiaotong. Clustering algorithm with maximum distance between clusters based on improved kernel fuzzy C-means[J]. Journal of Computer Applications, 2016, 36(7): 1981-1987.
