Journal of Computer Applications, 2022, Vol. 42, Issue (6): 1914-1921. DOI: 10.11772/j.issn.1001-9081.2021040551
• Data science and technology •
Man ZHANG, Zhengjun ZHANG, Junqi FENG, Tao YAN
Received: 2021-04-12
Revised: 2021-07-22
Accepted: 2021-08-05
Online: 2022-06-22
Published: 2022-06-10
Contact: Man ZHANG
About author: ZHANG Zhengjun, born in 1965 in Funing, Jiangsu, Ph. D., associate professor. His research interests include data mining, graphics technology, and image processing.
Man ZHANG, Zhengjun ZHANG, Junqi FENG, Tao YAN. Density peak clustering algorithm based on adaptive reachable distance[J]. Journal of Computer Applications, 2022, 42(6): 1914-1921.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2021040551
Dataset | Number of classes | Number of instances | Dimensions |
---|---|---|---|
ThreeCircles | 2 | 3603 | 2 |
Jain | 2 | 373 | 2 |
Compound | 6 | 399 | 2 |
Pathbased | 10 | 1484 | 2 |
Tab. 1 Synthetic datasets used in experiments
Dataset | Algorithm | Class ratio (found/true) | RI | NMI | F1-measure |
---|---|---|---|---|---|
ThreeCircles | DBSCAN | 3/3 | 0.9907 | 0.9620 | 0.9987 |
ThreeCircles | CFSFDP | 3/3 | 0.5941 | 0.2232 | 0.4645 |
ThreeCircles | DADPC | 3/3 | 1.0000 | 1.0000 | 1.0000 |
ThreeCircles | ARD-DPC | 3/3 | 1.0000 | 1.0000 | 1.0000 |
Jain | DBSCAN | 6/2 | 0.7525 | 0.9533 | 0.9605 |
Jain | CFSFDP | 2/2 | 0.6446 | 0.8562 | 0.8766 |
Jain | DADPC | 2/2 | 1.0000 | 1.0000 | 1.0000 |
Jain | ARD-DPC | 2/2 | 1.0000 | 1.0000 | 1.0000 |
Compound | DBSCAN | 5/6 | 0.9281 | 0.9844 | 0.9687 |
Compound | CFSFDP | 3/6 | 0.7917 | 0.8467 | 0.6915 |
Compound | DADPC | 2/6 | 0.5877 | 0.7068 | 0.6277 |
Compound | ARD-DPC | 6/6 | 0.9538 | 0.9875 | 0.9753 |
Pathbased | DBSCAN | 3/3 | 0.6960 | 0.8138 | 0.7311 |
Pathbased | CFSFDP | 3/3 | 0.5519 | 0.7509 | 0.6622 |
Pathbased | DADPC | 2/3 | 0.5289 | 0.7183 | 0.6793 |
Pathbased | ARD-DPC | 3/3 | 0.8898 | 0.9623 | 0.9419 |
Tab. 2 Comparison of evaluation indicators of four algorithms on synthetic datasets
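The RI, NMI, and F1-measure columns score predicted cluster labels against ground-truth classes. As a minimal sketch of how such scores can be computed, the Python code below assumes the pair-counting definitions of RI and F1 and an NMI normalized by the mean of the two label entropies. This page does not state the exact formulas behind the reported values, so these definitions, and all function names, are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations
from math import log


def pair_counts(truth, pred):
    """Classify every pair of points as TP, FP, FN, or TN.

    A pair is "positive" when both points share a predicted cluster,
    and "true" when that decision agrees with the ground-truth classes.
    """
    tp = fp = fn = tn = 0
    for i, j in combinations(range(len(truth)), 2):
        same_truth = truth[i] == truth[j]
        same_pred = pred[i] == pred[j]
        if same_pred and same_truth:
            tp += 1
        elif same_pred:
            fp += 1
        elif same_truth:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def rand_index(truth, pred):
    """Fraction of point pairs on which the two labelings agree."""
    tp, fp, fn, tn = pair_counts(truth, pred)
    return (tp + tn) / (tp + fp + fn + tn)


def pairwise_f1(truth, pred):
    """Harmonic mean of pair-counting precision and recall (assumed definition)."""
    tp, fp, fn, _ = pair_counts(truth, pred)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum(c / n * log(c / n) for c in Counter(labels).values())


def nmi(truth, pred):
    """Mutual information divided by the mean of the two entropies (assumed normalization)."""
    n = len(truth)
    joint = Counter(zip(truth, pred))
    ct, cp = Counter(truth), Counter(pred)
    mi = sum(c / n * log(c * n / (ct[t] * cp[p])) for (t, p), c in joint.items())
    denom = (entropy(truth) + entropy(pred)) / 2
    # Convention chosen here: two single-cluster labelings count as identical.
    return mi / denom if denom > 0 else 1.0


if __name__ == "__main__":
    truth = [0, 0, 0, 1, 1, 1]
    pred = [0, 0, 1, 1, 2, 2]
    print(rand_index(truth, pred), nmi(truth, pred), pairwise_f1(truth, pred))
```

For the toy labels in the `__main__` block, 10 of the 15 pairs agree, so the Rand index is 10/15, and the pairwise precision of 2/3 and recall of 1/3 give an F1 of 4/9. Pair counting is quadratic in the number of points, which is acceptable for datasets of the sizes listed in Tab. 1.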
Dataset | Number of classes | Number of instances | Dimensions |
---|---|---|---|
Wine | 3 | 178 | 13 |
Glass | 6 | 214 | 9 |
Heart | 2 | 303 | 13 |
Breast | 2 | 277 | 9 |
Iris | 3 | 150 | 4 |
Pima | 2 | 768 | 8 |
Tab. 3 UCI datasets used in experiments
Dataset | Algorithm | Class ratio (found/true) | NMI | RI | F1-measure |
---|---|---|---|---|---|
Wine | DBSCAN | 12/3 | 0.1501 | 0.4164 | 0.4630 |
Wine | CFSFDP | 2/3 | 0.2271 | 0.6406 | 0.6184 |
Wine | DADPC | 2/3 | 0.0720 | 0.3677 | 0.4990 |
Wine | ARD-DPC | 2/3 | 0.3999 | 0.6890 | 0.6426 |
Glass | DBSCAN | 9/6 | 0.4225 | 0.6064 | 0.5106 |
Glass | CFSFDP | 2/6 | 0.2419 | 0.5060 | 0.4823 |
Glass | DADPC | 2/6 | 0.0344 | 0.2765 | 0.4173 |
Glass | ARD-DPC | 3/6 | 0.4565 | 0.6313 | 0.5266 |
Heart | DBSCAN | 10/2 | 0.1175 | 0.5245 | 0.2537 |
Heart | CFSFDP | 2/2 | 0.1368 | 0.5757 | 0.6217 |
Heart | DADPC | 2/2 | 0.1149 | 0.5706 | 0.6044 |
Heart | ARD-DPC | 2/2 | 0.1011 | 0.5034 | 0.6232 |
Breast | DBSCAN | 7/2 | 0.0427 | 0.4884 | 0.4008 |
Breast | CFSFDP | 2/2 | 0.0142 | 0.5877 | 0.7386 |
Breast | DADPC | 2/2 | 0.0325 | 0.6657 | 0.7843 |
Breast | ARD-DPC | 3/2 | 0.0726 | 0.6475 | 0.7819 |
Iris | DBSCAN | 2/3 | 0.7337 | 0.7763 | 0.7462 |
Iris | CFSFDP | 3/3 | 0.8057 | 0.8923 | 0.8404 |
Iris | DADPC | 3/3 | 0.8057 | 0.8923 | 0.8404 |
Iris | ARD-DPC | 3/3 | 0.8064 | 0.9160 | 0.8660 |
Pima | DBSCAN | 1/2 | 0.0070 | 0.5474 | 0.7026 |
Pima | CFSFDP | 1/2 | 0.0042 | 0.5450 | 0.7055 |
Pima | DADPC | 2/2 | 0.0032 | 0.5268 | 0.6535 |
Pima | ARD-DPC | 1/2 | 0.0036 | 0.5427 | 0.6833 |
Tab. 4 Comparison of evaluation metrics of four algorithms on UCI datasets