Journal of Computer Applications, 2023, Vol. 43, Issue (2): 407-412. DOI: 10.11772/j.issn.1001-9081.2021122126

• Data science and technology •

Diversity represented deep subspace clustering algorithm

Zhifeng MA1,2, Junyang YU1,2, Longge WANG1,2

  1. College of Software, Henan University, Kaifeng Henan 475004, China
  2. Henan Province Intelligent Data Processing Engineering Research Center (Henan University), Kaifeng Henan 475004, China
  • Received: 2021-12-21 Revised: 2022-07-04 Accepted: 2022-07-15 Online: 2022-09-23 Published: 2023-02-10
  • Contact: Longge WANG
  • About author: MA Zhifeng, born in 1998, M. S. candidate. His research interests include image processing and deep learning.
    YU Junyang, born in 1982, Ph. D., associate professor. His research interests include distributed computing and artificial intelligence.
  • Supported by:
    Key Scientific and Technological Project of Henan Province (212102210078)

多样性表示的深度子空间聚类算法

马志峰1,2, 于俊洋1,2, 王龙葛1,2

  1. 河南大学 软件学院,河南 开封 475004
  2. 河南省智能数据处理工程研究中心(河南大学),河南 开封 475004
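  • Contact: 王龙葛 (Longge WANG)
  • About author: 马志峰 (MA Zhifeng), born in 1998 in Puyang, Henan, male, M. S. candidate. Research interests: image processing, deep learning.
    于俊洋 (YU Junyang), born in 1982 in Huaiyang, Henan, male, Ph. D., associate professor, CCF member. Research interests: distributed computing, artificial intelligence.
  • Supported by:
    Key Scientific and Technological Project of Henan Province (212102210078)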

Abstract:

To address the difficulty of mining the complementary information carried by features at different levels in deep subspace clustering, a Diversity Represented Deep Subspace Clustering (DRDSC) algorithm was proposed. Built on a deep autoencoder, it explores the complementary information between the low-level and high-level features produced by the encoder. Firstly, a model that measures the diversity of representations across feature levels was established based on the Hilbert-Schmidt Independence Criterion (HSIC). Secondly, a feature diversity representation module was introduced into the deep autoencoder network structure to mine image features that improve clustering. Furthermore, the form of the loss function was updated to effectively fuse the underlying subspaces of the multi-level representations. Finally, several experiments were conducted on commonly used clustering datasets. Experimental results show that on the Extended Yale B, ORL, COIL20 and Umist datasets, the clustering error rates of DRDSC are 1.23%, 10.50%, 1.74% and 17.71%, respectively. These are 10.41, 16.75, 13.12 and 12.92 percentage points lower than those of Efficient Dense Subspace Clustering (EDSC), and 1.44, 3.50, 3.68 and 9.17 percentage points lower than those of Deep Subspace Clustering (DSC), indicating that the proposed DRDSC algorithm clusters more accurately on these datasets.
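
Since the abstract names HSIC as the basis of the diversity measure without giving its form, the following is a minimal sketch of the standard biased empirical estimator, tr(KHLH)/(n-1)^2, where K and L are Gram matrices of the two feature sets and H is the centering matrix. The linear (inner-product) kernel, the function names `gram_linear`, `center` and `hsic`, and the toy inputs are assumptions made for illustration; the abstract does not state which kernel or which exact formulation DRDSC uses.

```python
# Illustrative sketch only: biased empirical HSIC with a linear kernel.
# Kernel choice and all names are assumptions, not taken from the paper.

def gram_linear(rows):
    """Gram matrix K[i][j] = <rows[i], rows[j]> for a list of feature vectors."""
    return [[sum(a * b for a, b in zip(u, v)) for v in rows] for u in rows]


def center(k):
    """Double-center K, i.e. compute H K H with H = I - (1/n) 1 1^T."""
    n = len(k)
    row_mean = [sum(row) / n for row in k]
    col_mean = [sum(k[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_mean) / n
    return [
        [k[i][j] - row_mean[i] - col_mean[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def hsic(x, y):
    """Biased estimator tr(K H L H) / (n - 1)^2 for paired samples x[i], y[i]."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must hold the same number (>= 2) of samples")
    kc = center(gram_linear(x))
    l = gram_linear(y)
    # tr(K H L H) = tr((H K H) L) by cyclicity of the trace.
    trace = sum(kc[i][j] * l[j][i] for i in range(n) for j in range(n))
    return trace / (n - 1) ** 2


# Toy stand-ins for low-level and high-level features of four samples.
low = [[1.0], [2.0], [3.0], [4.0]]
high_same = [[1.0], [2.0], [3.0], [4.0]]    # identical to `low`: strongly dependent
high_const = [[5.0], [5.0], [5.0], [5.0]]   # constant: carries no dependence

dependent_score = hsic(low, high_same)
independent_score = hsic(low, high_const)
```

In this sketch a larger value means the two representations are more dependent, so a diversity term would typically push the value down. With a linear kernel the estimator only captures linear dependence; a Gaussian or other characteristic kernel would be needed to detect general dependence.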

Key words: Hilbert-Schmidt Independence Criterion (HSIC), autoencoder, similarity matrix, spectral clustering, subspace clustering


CLC Number: