Journal of Computer Applications, 2025, Vol. 45, Issue (1): 115-126. DOI: 10.11772/j.issn.1001-9081.2023121724

• Data science and technology •

Self-adaptive multi-view clustering algorithm with complementarity based on weighted anchors

Zhuoyue OU, Xiuqin DENG, Lei CHEN

  1. School of Mathematics and Statistics, Guangdong University of Technology, Guangzhou, Guangdong 510520, China
  • Received: 2023-12-13 Revised: 2024-03-25 Accepted: 2024-03-27 Online: 2024-04-28 Published: 2025-01-10
  • Contact: Xiuqin DENG
  • About author: OU Zhuoyue, born in 2000 in Guangzhou, Guangdong, M.S. candidate. His research interests include machine learning and data mining.
    CHEN Lei, born in 1988 in Xinyang, Henan, Ph.D., lecturer. His research interests include machine learning and intelligent optimization algorithms.
  • Supported by:
    Youth Fund Project of the National Natural Science Foundation of China (62006044); Science and Technology Program of Guangzhou (202201010377)

基于加权锚点的自适应多视图互补聚类算法

区卓越, 邓秀勤, 陈磊


Abstract:

In multi-view clustering, fully mining the correlation information among views while reducing the influence of redundant information on clustering performance is a problem that urgently needs to be solved. However, existing algorithms either ignore the complementary information and the differences among views or do not consider the interference caused by redundant information, which results in poor clustering performance. To address these limitations, a Self-adaptive Multi-view clustering algorithm with Complementarity based on Weighted Anchors (SMCWA) was proposed. To handle high-dimensional multi-view data, firstly, feature concatenation was transferred to the anchor mechanism, so that the anchor graphs were fused to exploit the complementary information among views. Secondly, to weaken the expression of redundant information, the weight of each anchor was determined dynamically by a weighting matrix during the iterations. Finally, to exploit the differences among views, an auto-weighted mechanism was used to adaptively assign an appropriate weight to each view. These optimization steps were integrated into a single algorithm, so that view complementarity, the weakening of redundant information, and view differences reinforced and learned from one another over multiple iterations, thereby improving the clustering effect.
Experimental results show that the proposed algorithm improves the Matthews Correlation Coefficient (MCC) by 41.75% on the BDGP (Berkeley Drosophila Genome Project) dataset compared with the spectral clustering algorithm SC-Concat, by 11.83% on the CCV (Columbia Consumer Video) dataset compared with the Large-scale Multi-View Subspace Clustering in linear time (LMVSC) algorithm, and by 19.57% on the Caltech101-all dataset compared with the spectral clustering algorithm SC-Best. These results indicate that the proposed algorithm takes full account of the complementary information among views, the differences among views, and the redundant information, and thereby obtains better clustering performance.
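
The reported gains are measured in MCC. As a reminder of what this metric computes, the following is a minimal sketch of the multi-class MCC. It assumes that the predicted cluster labels have already been mapped to class labels; the abstract does not state how this mapping was done, and the function name `multiclass_mcc` is illustrative rather than taken from the paper.

```python
from collections import Counter
from math import sqrt


def multiclass_mcc(y_true, y_pred):
    """Multi-class Matthews correlation coefficient.

    Assumption: y_pred already uses the same label set as y_true,
    i.e. cluster indices were aligned to classes beforehand.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label sequences must be non-empty and equal in length")
    s = len(y_true)                                   # total number of samples
    c = sum(t == p for t, p in zip(y_true, y_pred))   # correctly labelled samples
    t_k = Counter(y_true)                             # true count per class
    p_k = Counter(y_pred)                             # predicted count per class
    numerator = c * s - sum(t_k[k] * p_k[k] for k in t_k)
    denominator = sqrt((s * s - sum(v * v for v in p_k.values()))
                       * (s * s - sum(v * v for v in t_k.values())))
    # By convention, return 0 when either labelling uses a single class.
    return numerator / denominator if denominator else 0.0
```

A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate systematic disagreement. Because the count of every class enters the formula, MCC is less easily inflated by a dominant class than plain accuracy.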

Key words: auto-weighted mechanism, complementarity, anchor mechanism, subspace clustering, multi-view clustering


CLC Number: