Journal of Computer Applications
Yang Chenghao1, Hu Jie1, Wang Hongjun2, Peng Bo3
Abstract: To address the uncertainty of completing missing view data, the lack of robustness in embedding learning, and the low model generalization of traditional deep incomplete multi-view clustering algorithms, an Incomplete Multi-View Clustering algorithm based on Attention Mechanism (IMVCAM) was proposed. First, K-Nearest Neighbors (KNN) was used to complete the missing data in each view, so that the training data became complementary. Then, the data were passed through a linear encoding layer, and the resulting embedding was passed through an attention layer to improve its quality. Finally, the embedding learned for each view was clustered with the k-means clustering algorithm (k-means), and the view weights were determined by the Pearson correlation coefficient. Experiments were conducted on five classic datasets, and the best results were achieved on the Fashion dataset. On the Fashion dataset, compared with the suboptimal DSIMVC (Deep Safe Incomplete Multi-View Clustering), IMVCAM improved the clustering accuracy by 2.85 and 4.35 percentage points at data missing rates of 0.1 and 0.3, respectively. On the Caltech101-20 dataset, compared with the suboptimal IMVCSAF (Incomplete Multi-View Clustering algorithm based on Self-Attention Fusion), the clustering accuracy increased by 7.68 and 3.48 percentage points at missing rates of 0.1 and 0.3, respectively.
Key words: Incomplete multi-view clustering, Attention mechanism, K-Nearest Neighbors (KNN), k-means clustering algorithm (k-means), Pearson correlation coefficient
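The abstract names KNN completion and Pearson-based view weighting but does not give their exact form. The following is a minimal sketch of how those two steps might look, not the authors' implementation. The function names (`knn_complete`, `pearson`, `view_weights`), the use of a fully observed reference view to find neighbours, the mean-of-neighbours imputation, and the weighting by mean absolute correlation are all assumptions made for illustration. The encoding, attention, and k-means stages are omitted.

```python
import math


def knn_complete(view_a, view_b, k=2):
    """Fill missing rows of view_b (marked None) from neighbours found in view_a.

    Assumptions (not from the paper): view_a is fully observed, at least one
    row of view_b is observed, and a missing row is imputed with the
    feature-wise mean of its k nearest neighbours' view_b rows.
    """
    observed = [i for i, row in enumerate(view_b) if row is not None]
    completed = []
    for i, row in enumerate(view_b):
        if row is not None:
            completed.append(list(row))
            continue
        # Rank samples that have an observed view_b row by distance in view_a.
        nearest = sorted(
            observed, key=lambda j: math.dist(view_a[i], view_a[j])
        )[:k]
        dim = len(view_b[nearest[0]])
        completed.append(
            [sum(view_b[j][d] for j in nearest) / len(nearest) for d in range(dim)]
        )
    return completed


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def view_weights(flat_views):
    """Hypothetical weighting: score each view by its mean absolute Pearson
    correlation with the other views, then normalize the scores to sum to 1."""
    scores = []
    for i, v in enumerate(flat_views):
        others = [abs(pearson(v, w)) for j, w in enumerate(flat_views) if j != i]
        scores.append(sum(others) / len(others))
    total = sum(scores)
    return [s / total for s in scores]


if __name__ == "__main__":
    view_a = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
    view_b = [[1.0], None, [9.0], [9.2]]  # sample 1 is missing in view_b
    print(knn_complete(view_a, view_b, k=1))
    print(view_weights([[1.0, 2.0, 3.0, 4.0], [2.0, 4.1, 5.9, 8.0], [4.0, 1.0, 3.0, 2.0]]))
```

In this sketch a view that agrees more strongly with the others receives a larger weight. Whether IMVCAM correlates embeddings, cluster assignments, or something else is not stated in the abstract.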
CLC Number: TP391
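Yang Chenghao, Hu Jie, Wang Hongjun, Peng Bo. Incomplete multi-view clustering algorithm based on attention mechanism[J]. Journal of Computer Applications, DOI: 10.11772/j.issn.1001-9081.2023121866.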
URL: http://www.joca.cn/EN/10.11772/j.issn.1001-9081.2023121866