Journal of Computer Applications, 2022, Vol. 42, Issue (4): 1148-1154. DOI: 10.11772/j.issn.1001-9081.2021071271

• The 36th CCF National Conference of Computer Applications (CCF NCCA 2021)

Sparse subspace clustering method based on random blocking

Qi ZHANG1, Bochuan ZHENG2, Zheng ZHANG1, Huanhuan ZHOU1

  1. School of Mathematics and Information, China West Normal University, Nanchong, Sichuan 637009, China
  2. School of Computer Science, China West Normal University, Nanchong, Sichuan 637009, China
  • Received: 2021-07-16 Revised: 2021-08-23 Accepted: 2021-08-27 Online: 2022-04-15 Published: 2022-04-10
  • Corresponding author: Bochuan ZHENG
  • About the authors:
    ZHANG Qi, born in 1996, native of Chongqing, M. S. candidate, CCF member. Her research interests include machine learning and clustering analysis.
    ZHENG Bochuan, born in 1974, Ph. D., professor. His research interests include machine learning, deep learning and computer vision.
    ZHANG Zheng, born in 1978, native of Zigong, Sichuan, M. S., associate professor. Her research interests include operations research and optimization.
    ZHOU Huanhuan, born in 1996, native of Dianjiang, Chongqing, M. S. candidate. Her research interests include machine learning and clustering analysis.
  • Supported by:
    National Natural Science Foundation of China (62176217); Sichuan Science and Technology Innovation Seedling Project (2020029)

Abstract:

To address the large clustering error of Sparse Subspace Clustering (SSC) methods, an SSC method based on random blocking was proposed. First, the dataset of the original problem was randomly divided into several subsets to construct several sub-problems. Then, the coefficient matrix of each sub-problem was obtained by the Alternating Direction Method of Multipliers (ADMM), and these coefficient matrices were each expanded to the size of the coefficient matrix of the original problem and integrated into a single coefficient matrix. Finally, a similarity matrix was calculated from the integrated coefficient matrix, and the clustering result of the original problem was obtained by the Spectral Clustering (SC) algorithm. Compared with the best-performing algorithm among SSC, Stochastic Sparse Subspace Clustering via Orthogonal Matching Pursuit with Consensus (S3COMP-C), scalable Sparse Subspace Clustering by Orthogonal Matching Pursuit (SSCOMP), SC and K-Means, the proposed method reduced the subspace clustering error by 3.12 percentage points on average, and was clearly better than the comparison algorithms on three performance measures: mutual information, Rand index and entropy. Experimental results show that the SSC method based on random blocking can reduce the subspace clustering error and improve clustering performance.

Key words: self-expression, random blocking, Spectral Clustering (SC), face clustering, sparse subspace
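
The pipeline summarized in the abstract (random split, per-block coefficients, expansion, integration, similarity matrix) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation, and several parts are assumptions: the per-block solver is a one-atom, matching-pursuit-style stand-in rather than ADMM; the random split is repeated for `rounds` rounds and the expanded matrices are integrated by elementwise summation; and the similarity matrix uses the common SSC convention W = |C| + |C|^T. All function names are hypothetical. The final spectral clustering step is omitted; W would be passed to any spectral clustering routine that accepts a precomputed affinity matrix.

```python
import random


def random_blocks(n, k, rng):
    """Shuffle indices 0..n-1 and split them into k nearly equal blocks."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[b::k] for b in range(k)]


def one_atom_coefficients(points):
    """Stand-in for the per-block ADMM solver (NOT ADMM).

    Each point is expressed by the single other point in the block that is
    most correlated with it, as in one step of matching pursuit.
    """
    m = len(points)
    coef = [[0.0] * m for _ in range(m)]
    for i, x in enumerate(points):
        best_j, best_score, best_c = None, 0.0, 0.0
        for j, y in enumerate(points):
            if i == j:
                continue  # a point may not represent itself
            dot = sum(a * b for a, b in zip(x, y))
            norm2 = sum(b * b for b in y)
            if norm2 == 0.0:
                continue
            score = abs(dot) / norm2 ** 0.5
            if score > best_score:
                best_j, best_score, best_c = j, score, dot / norm2
        if best_j is not None:
            coef[i][best_j] = best_c
    return coef


def expand(coef, block, n):
    """Place a block's m x m coefficients into an n x n matrix of zeros."""
    full = [[0.0] * n for _ in range(n)]
    for a, i in enumerate(block):
        for b, j in enumerate(block):
            full[i][j] = coef[a][b]
    return full


def blocked_similarity(data, k=2, rounds=5, seed=0):
    """Build a similarity matrix from randomly blocked sub-problems."""
    n = len(data)
    rng = random.Random(seed)
    total = [[0.0] * n for _ in range(n)]
    for _ in range(rounds):  # assumption: the random split is repeated
        for block in random_blocks(n, k, rng):
            coef = one_atom_coefficients([data[i] for i in block])
            full = expand(coef, block, n)
            for i in range(n):
                for j in range(n):
                    total[i][j] += full[i][j]  # assumption: integrate by summing
    # Common SSC convention (assumed here): W = |C| + |C|^T
    return [[abs(total[i][j]) + abs(total[j][i]) for j in range(n)]
            for i in range(n)]


# Toy data: three points on each of two orthogonal lines through the origin.
data = [(1.0, 0.0), (2.0, 0.0), (-3.0, 0.0),
        (0.0, 1.0), (0.0, 2.0), (0.0, -1.5)]
W = blocked_similarity(data, k=2, rounds=5, seed=0)
```

On this toy data, points on different lines have zero inner product, so the stand-in solver never links them and W only connects points within the same subspace. Real data would need a proper sparse solver in place of `one_atom_coefficients`.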
