Abstract: To address the problem that the similarity matrix constructed in spectral clustering does not guarantee high similarity between data points within the same cluster, a Multi-View spectral clustering algorithm based on Shared Nearest Neighbor (MV-SNN) is proposed. First, the similarity between two data points that share a large number of neighbors is increased, so that data points in the same cluster become more similar. Then, the improved similarity matrices of the multiple views are integrated into a global similarity matrix. Finally, because general spectral clustering methods still need the k-means algorithm to partition the data points at a later stage, a rank constraint on the Laplacian matrix is proposed so that the final cluster structure is obtained directly from the global similarity matrix. Experimental results show that, compared with other multi-view spectral clustering algorithms, MV-SNN improves three clustering measures (accuracy, purity and normalized mutual information) by 1%-20% and reduces the clustering time by about 50%. MV-SNN therefore improves clustering performance while reducing clustering time.
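The three steps summarized in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' formulation: the Gaussian kernel, the boost factor `1 + shared / k`, the plain averaging of views, and the names `snn_similarity`, `fuse_views` and `count_components` are all assumptions made for the example. The last function only illustrates the property that a rank constraint on the Laplacian relies on (a graph with exactly c connected components has a Laplacian with exactly c zero eigenvalues); it does not solve the constrained optimization itself.

```python
import numpy as np

def snn_similarity(X, k=5, sigma=1.0):
    """Gaussian similarity boosted by the number of shared k-nearest neighbors.

    The boost factor (1 + shared / k) is an assumed form, chosen for illustration.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances.
    d = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-d / (2.0 * sigma ** 2))
    # Exclude each point from its own neighbor list.
    np.fill_diagonal(d, np.inf)
    nn = np.argsort(d, axis=1)[:, :k]
    # member[i, j] = 1 if j is one of the k nearest neighbors of i.
    member = np.zeros((n, n))
    member[np.arange(n)[:, None], nn] = 1.0
    # shared[i, j] = number of neighbors that i and j have in common.
    shared = member @ member.T
    W = W * (1.0 + shared / k)
    np.fill_diagonal(W, 0.0)
    return W

def fuse_views(similarities):
    """Combine per-view similarity matrices by simple averaging (an assumption)."""
    return np.mean(np.stack(similarities), axis=0)

def count_components(S, tol=1e-8):
    """Count near-zero eigenvalues of the graph Laplacian L = D - S.

    For a non-negative symmetric S this equals the number of connected
    components of the graph, which is the quantity a Laplacian rank
    constraint fixes to the desired number of clusters.
    """
    S = (S + S.T) / 2.0
    L = np.diag(S.sum(axis=1)) - S
    eigenvalues = np.linalg.eigvalsh(L)
    return int(np.sum(eigenvalues < tol))
```

In this sketch, a caller would compute `snn_similarity` once per view, pass the resulting list to `fuse_views`, and inspect `count_components` on the global matrix; when that count already equals the desired number of clusters, the clusters can be read off as connected components without a k-means step.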