Journal of Computer Applications, 2022, Vol. 42, Issue (8): 2450-2460. DOI: 10.11772/j.issn.1001-9081.2021061083
• Data Science and Technology •
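Received:
2021-06-24
Revised:
2021-12-07
Accepted:
2021-12-17
Online:
2022-01-25
Published:
2022-08-10
Corresponding author:
Xingwang ZHAO
About the author:
CHEN Yanwei (born 1996), male, from Weifang, Shandong; M. S. candidate and CCF member. His research interests include data mining and machine learning.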
Yanwei CHEN, Xingwang ZHAO
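Abstract:
Density-based clustering algorithms are widely used because they are robust to noise and can discover clusters of arbitrary shape. In practice, however, their clustering quality degrades when the clusters in a dataset have uneven density distributions and the boundaries between clusters are hard to distinguish. To address these problems, a Varied Density Clustering algorithm based on Border point Detection (VDCBD) was proposed. Firstly, border points between clusters of varied density were identified with the given relative density measure, which enhanced the separability of adjacent clusters. Secondly, the points in non-border regions were clustered to find the core cluster structure of the dataset. Thirdly, the detected border points were assigned to the corresponding core cluster structures according to the high-density nearest-neighbor assignment principle. Finally, the noise points in the dataset were identified from the cluster structure information. The proposed algorithm was compared with K-means, Density-Based Spatial Clustering of Applications with Noise (DBSCAN), the Density Peaks Clustering Algorithm (DPCA), the CLUB algorithm (clustering by effectively identifying the density backbone), and the Border-Peeling clustering (BP) algorithm on artificial datasets and UCI datasets. Experimental results show that the proposed algorithm effectively handles uneven cluster densities and hard-to-distinguish boundaries, and outperforms the existing algorithms on the evaluation metrics Adjusted Rand Index (ARI), Normalized Mutual Information (NMI), F-Measure (FM), and Accuracy (ACC). In the running-efficiency analysis, VDCBD runs more efficiently than DPCA, CLUB, and BP when the data scale is large.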
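The first three steps of the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the relative density formula, so the sketch assumes a stand-in (a point's kNN density divided by the mean density of its k nearest neighbors), and the names `knn_indices`, `border_aware_clustering`, `k`, and `border_ratio` are hypothetical. Core clusters are taken as connected components of the kNN graph over non-border points, border points are attached to the nearest already-labeled point in descending density order, and the final noise-identification step is omitted. Points are assumed to be distinct.

```python
import math


def knn_indices(points, k):
    """Brute-force k nearest neighbors (by index) for every point."""
    neighbors = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        neighbors.append(others[:k])
    return neighbors


def border_aware_clustering(points, k=4, border_ratio=0.9):
    """Return (labels, border_flags) for a list of coordinate tuples."""
    n = len(points)
    nbrs = knn_indices(points, k)

    # Local density: inverse of the mean distance to the k nearest neighbors.
    density = [k / sum(math.dist(points[i], points[j]) for j in nbrs[i])
               for i in range(n)]
    # Assumed relative density: own density over mean neighbor density.
    relative = [density[i] * k / sum(density[j] for j in nbrs[i])
                for i in range(n)]
    # Step 1: points clearly sparser than their neighborhood are border points.
    border = [r < border_ratio for r in relative]

    # Symmetric kNN graph used to connect non-border points.
    adjacency = [set() for _ in range(n)]
    for i in range(n):
        for j in nbrs[i]:
            adjacency[i].add(j)
            adjacency[j].add(i)

    # Step 2: core clusters are connected components of non-border points.
    labels = [-1] * n
    next_label = 0
    for start in range(n):
        if border[start] or labels[start] != -1:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            u = stack.pop()
            for v in adjacency[u]:
                if not border[v] and labels[v] == -1:
                    labels[v] = next_label
                    stack.append(v)
        next_label += 1

    # Step 3: attach border points, densest first, to the nearest labeled point.
    border_order = sorted((i for i in range(n) if border[i]),
                          key=lambda i: -density[i])
    for i in border_order:
        labeled = [j for j in range(n) if labels[j] != -1]
        if not labeled:
            break  # no core structure was found; leave the rest unlabeled
        nearest = min(labeled, key=lambda j: math.dist(points[i], points[j]))
        labels[i] = labels[nearest]
    return labels, border


# Two well-separated grids whose densities differ by a factor of ten.
dense = [(0.1 * i, 0.1 * j) for i in range(5) for j in range(5)]
sparse = [(10.0 + i, 10.0 + j) for i in range(5) for j in range(5)]
labels, border = border_aware_clustering(dense + sparse)
print(sorted(set(labels)), sum(border))
```

Because the assumed relative density is a ratio of local densities, it does not depend on the absolute scale of a cluster, which is the property a varied-density method needs: the sparse grid and the dense grid are treated alike.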
Yanwei CHEN, Xingwang ZHAO. Varied density clustering algorithm based on border point detection[J]. Journal of Computer Applications, 2022, 42(8): 2450-2460.
Dataset | Samples | Features | Classes |
---|---|---|---|
Flame | 240 | 2 | 2 |
Jain | 373 | 2 | 2 |
Aggregation | 788 | 2 | 7 |
D31 | 3100 | 2 | 31 |
T4 | 8000 | 2 | 6 |
T8 | 8000 | 2 | 8 |
S1 | 800 | 2 | 3 |
S2 | 8000 | 2 | 3 |
Tab. 1 Description of artificial datasets
Algorithm | Flame ARI | Flame NMI | Flame FM | Flame ACC | Flame parameter | Jain ARI | Jain NMI | Jain FM | Jain ACC | Jain parameter |
---|---|---|---|---|---|---|---|---|---|---|
K-means | 0.5158 | 0.8035 | 0.6024 | 0.4667 | 2 | 0.5767 | 0.5274 | 0.8867 | 0.8820 | 2 |
DBSCAN | 0.9388 | 0.8665 | 0.9831 | 0.9875 | 0.09/8 | 0.9758 | 0.9281 | 0.9873 | 1.0000 | 0.08/2 |
DPCA | 0.9881 | 0.9706 | 0.9879 | 0.9916 | 2.8 | 0.5146 | 0.5050 | 0.8681 | 0.8606 | 2 |
CLUB | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 9-19 | 0.7133 | 0.5535 | 0.8387 | 0.9437 | 7 |
BP | 0.9550 | 0.9080 | 0.9791 | 0.9917 | none | 0.2302 | 0.4513 | 0.5293 | 0.9249 | none |
VDCBD | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 9-15 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 19 |

Algorithm | Aggregation ARI | Aggregation NMI | Aggregation FM | Aggregation ACC | Aggregation parameter | D31 ARI | D31 NMI | D31 FM | D31 ACC | D31 parameter |
---|---|---|---|---|---|---|---|---|---|---|
K-means | 0.7767 | 0.8521 | 0.8605 | 0.9048 | 7 | 0.9535 | 0.9676 | 0.9771 | 0.9660 | 31 |
DBSCAN | 0.9779 | 0.9681 | 0.9897 | 0.9911 | 0.04/6 | 0.8078 | 0.9132 | 0.8814 | 0.8287 | 0.04/38 |
DPCA | 0.9913 | 0.9869 | 0.9949 | 0.9949 | 0.14 | 0.9345 | 0.9568 | 0.9674 | 0.9674 | 0.6 |
CLUB | 0.9843 | 0.9781 | 0.9936 | 0.9936 | 26 | 0.9396 | 0.9593 | 0.9700 | 0.9703 | 25-26 |
BP | 0.9927 | 0.9883 | 0.9443 | 0.9962 | none | 0.9086 | 0.9391 | 0.9116 | 0.9384 | none |
VDCBD | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 25 | 0.9478 | 0.9657 | 0.9790 | 0.9791 | 31 |
Tab. 2 Clustering results of 6 algorithms on 4 artificial datasets
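As a reading aid for the ARI columns, the following minimal Python sketch computes the Adjusted Rand Index in its standard pair-counting form. It is an illustration under that assumption, not the evaluation code used for the tables; the function name `adjusted_rand_index` is hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(true_labels, pred_labels):
    """Adjusted Rand Index between two labelings of the same samples."""
    n = len(true_labels)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two samples: treat the labelings as identical

    # Pairs of samples that are grouped together in both labelings.
    cell_counts = Counter(zip(true_labels, pred_labels))
    together_both = sum(comb(c, 2) for c in cell_counts.values())
    # Pairs grouped together in each labeling taken on its own.
    together_true = sum(comb(c, 2) for c in Counter(true_labels).values())
    together_pred = sum(comb(c, 2) for c in Counter(pred_labels).values())

    # Agreement expected by chance, and the maximum attainable agreement.
    expected = together_true * together_pred / total_pairs
    maximum = (together_true + together_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings put everything in one cluster
    return (together_both - expected) / (maximum - expected)


# Cluster names do not matter, only which samples are grouped together.
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

The chance correction is why ARI can fall below zero when a clustering agrees with the reference labels less often than a random assignment would.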
Dataset | Samples | Features | Classes |
---|---|---|---|
Iris | 150 | 4 | 3 |
Wine | 178 | 13 | 3 |
Leaf | 340 | 16 | 30 |
Ecoli | 336 | 8 | 8 |
Seeds | 210 | 7 | 3 |
Segmentation | 2 | 19 | 7 |
Wall-Following | 5 | 24 | 2 |
Pendigits | 10 | 16 | 10 |
Tab. 3 Description of real datasets
Algorithm | Iris ARI | Iris NMI | Iris FM | Iris ACC | Iris parameter | Wine ARI | Wine NMI | Wine FM | Wine ACC | Wine parameter |
---|---|---|---|---|---|---|---|---|---|---|
K-means | 0.7163 | 0.7419 | 0.8852 | 0.8867 | 3 | 0.9149 | 0.8926 | 0.9961 | 0.9663 | 3 |
DBSCAN | 0.6246 | 0.6633 | 0.8616 | 0.8600 | 0.13/9 | 0.5378 | 0.5982 | 0.8053 | 0.8146 | 0.51/23 |
DPCA | 0.8857 | 0.8641 | 0.9599 | 0.9600 | 0.2 | 0.6471 | 0.6959 | 0.8653 | 0.8708 | 2 |
CLUB | 0.7145 | 0.7358 | 0.8684 | 0.9533 | 5 | 0.4214 | 0.5374 | 0.7160 | 0.6460 | 7 |
BP | 0.5588 | 0.6931 | 0.7554 | 0.6800 | none | 0.3506 | 0.3704 | 0.5496 | 0.6348 | none |
VDCBD | 0.9034 | 0.8705 | 0.9599 | 0.9600 | 10 | 0.8368 | 0.8252 | 0.9432 | 0.9432 | 20 |

Algorithm | Leaf ARI | Leaf NMI | Leaf FM | Leaf ACC | Leaf parameter | Ecoli ARI | Ecoli NMI | Ecoli FM | Ecoli ACC | Ecoli parameter |
---|---|---|---|---|---|---|---|---|---|---|
K-means | 0.4147 | 0.7267 | 0.5856 | 0.5882 | 30 | 0.7216 | 0.6891 | 0.7900 | 0.8006 | 8 |
DBSCAN | 0.2200 | 0.7556 | 0.3772 | 0.9623 | 0.12/1 | 0.6491 | 0.5867 | 0.7442 | 0.7321 | 0.32/30 |
DPCA | 0.2851 | 0.6606 | 0.4414 | 0.4324 | 30 | 0.4645 | 0.6252 | 0.7146 | 0.8155 | 0.4 |
CLUB | 0.2288 | 0.6702 | 0.5357 | 0.600 | 1 | 0.7298 | 0.6935 | 0.7850 | 0.7797 | 15 |
BP | 0.1567 | 0.5323 | 0.2917 | 0.2353 | none | 0.6635 | 0.6347 | 0.7714 | 0.7321 | none |
VDCBD | 0.4203 | 0.7443 | 0.5921 | 0.6618 | 3 | 0.7699 | 0.7448 | 0.8355 | 0.8393 | 7 |

Algorithm | Seeds ARI | Seeds NMI | Seeds FM | Seeds ACC | Seeds parameter | Segmentation ARI | Segmentation NMI | Segmentation FM | Segmentation ACC | Segmentation parameter |
---|---|---|---|---|---|---|---|---|---|---|
K-means | 0.7048 | 0.6743 | 0.8905 | 0.8905 | 3 | 0.5004 | 0.6372 | 0.6834 | 0.7095 | 7 |
DBSCAN | 0.5843 | 0.5787 | 0.8315 | 0.8381 | 0.34/43 | 0.5266 | 0.6672 | 0.7065 | 0.7476 | 1.38/5 |
DPCA | 0.7075 | 0.6796 | 0.8883 | 0.8905 | 0.7 | 0.5023 | 0.6258 | 0.7008 | 0.6714 | 1.5 |
CLUB | 0.7505 | 0.7054 | 0.9050 | 0.9143 | 5 | 0.5027 | 0.6254 | 0.6982 | 0.6809 | 7 |
BP | 0.6168 | 0.6090 | 0.7432 | 0.8476 | none | 0.1008 | 0.3550 | 0.4257 | 0.3000 | none |
VDCBD | 0.7697 | 0.6943 | 0.9042 | 0.9048 | 14 | 0.5397 | 0.6561 | 0.7139 | 0.7381 | 11 |

Algorithm | Wall-Following ARI | Wall-Following NMI | Wall-Following FM | Wall-Following ACC | Wall-Following parameter | Pendigits ARI | Pendigits NMI | Pendigits FM | Pendigits ACC | Pendigits parameter |
---|---|---|---|---|---|---|---|---|---|---|
K-means | 0.3590 | 0.2619 | 0.8000 | 0.8002 | 2 | 0.6421 | 0.7196 | 0.7654 | 0.7752 | 10 |
DBSCAN | 0.0760 | 0.1553 | 0.4778 | 0.5282 | 0.77/50 | 0.6302 | 0.7236 | 0.7476 | 0.8688 | 0.32/1 |
DPCA | 0.0510 | 0.1020 | 0.4122 | 0.4798 | 0.33 | 0.5974 | 0.7322 | 0.7385 | 0.7486 | 0.31 |
CLUB | 0.0020 | 0.2190 | 0.2364 | 0.7912 | 1 | 0.6750 | 0.7859 | 0.7676 | 0.8285 | 10 |
BP | -0.0145 | 0.0553 | 0.4785 | 0.4371 | none | 0.7259 | 0.8214 | 0.7578 | 0.7410 | none |
VDCBD | 0.0665 | 0.1922 | 0.4871 | 0.5649 | 14 | 0.6668 | 0.7752 | 0.7786 | 0.8500 | 7 |
Tab. 4 Clustering results of 6 algorithms on 8 real datasets