
Semantic optimization based image clustering algorithm

Zhang Kai, Song Chengyun

  1. Chongqing University of Technology
  • Received: 2023-04-23  Revised: 2023-05-31  Online: 2023-12-04
  • Corresponding author: Song Chengyun


Abstract: Contrastive learning between data views has achieved remarkable success in deep clustering. However, because all of the supervisory information for one view comes from the other view, contrastive learning can only obtain the minimal sufficient representation of the information shared by the two views, and it weakens the information that is not shared between them. Given the rich feature representations that image clustering requires, there is no guarantee that all information relevant to the clustering task is contained in the shared information between views; that is, the minimal sufficient representation obtained by contrastive learning is insufficient for clustering. Therefore, a method for optimizing semantic features is proposed. In the pre-training stage, the method uses a reconstruction loss as a regularization term to increase the mutual information between the feature representation and the input, thereby approximately introducing more task-relevant information and reducing the risk that contrastive learning overfits the shared information. In the fine-tuning stage, instead of the conventional scheme of updating a clustering algorithm and the clustering network simultaneously, the clustering network is updated with a loss based on the similarity differences among an image's nearest neighbors, so as to exploit the nearest-neighbor semantic information between images as fully as possible. Experiments on the CIFAR10, CIFAR100, and STL10 datasets show that the method achieves higher accuracy than all compared methods, improving on the second-best method, SCAN, by 2.7% on STL10, and it leads on the other evaluation metrics as well, which verifies its effectiveness.

Keywords: deep clustering, contrastive learning, semantic features, overfitting, regularization


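The two loss components described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the paper's exact formulation: the InfoNCE-style contrastive term, the mean-squared reconstruction regularizer, the trade-off weight `lam`, and the dot-product similarity between neighbor assignments (as in SCAN) are all assumptions made for the sake of the example.

```python
import numpy as np

def contrastive_loss(z1, z2, tau=0.5):
    """InfoNCE-style loss between two views' embeddings.

    z1, z2: (n, d) embeddings of the two augmented views; row i of z1
    is pulled toward row i of z2 and pushed away from all other rows.
    """
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / tau                        # (n, n) similarity logits
    sim = sim - sim.max(axis=1, keepdims=True)   # numerical stability
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))           # positives on the diagonal

def reconstruction_loss(x, x_hat):
    """Mean-squared reconstruction error, used as the regularization term."""
    return np.mean((x - x_hat) ** 2)

def pretrain_loss(z1, z2, x, x_hat, lam=1.0):
    """Pre-training objective: contrastive term + reconstruction regularizer.

    lam is an assumed trade-off hyperparameter; the abstract does not
    specify how the two terms are weighted.
    """
    return contrastive_loss(z1, z2) + lam * reconstruction_loss(x, x_hat)

def neighbor_consistency_loss(p, neighbor_idx):
    """Fine-tuning loss over nearest-neighbor soft cluster assignments.

    p: (n, k) soft cluster assignments (rows sum to 1).
    neighbor_idx: (n, m) indices of each sample's m nearest neighbors.
    Penalizes low assignment similarity between a sample and its neighbors.
    """
    sims = np.einsum('nk,nmk->nm', p, p[neighbor_idx])
    return -np.mean(np.log(sims + 1e-12))
```

With identical views and a perfect reconstruction, the regularizer vanishes and only the contrastive term remains; with one-hot assignments that agree across neighbors, the neighbor-consistency loss approaches zero.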