Parallel sparse subspace clustering via coordinate descent minimization

doi:10.11772/j.issn.1001-9081.2016.02.0372

Journal of Computer Applications, 2016, Vol. 36, Issue (2): 372-376. DOI: 10.11772/j.issn.1001-9081.2016.02.0372

• The 3rd CCF Academic Conference on Big Data (CCF BigData 2015) •

WU Jieqi, LI Xiaoyu, YUAN Xiaotong, LIU Qingshan

Jiangsu Key Laboratory of Big Data Analysis Technology, Nanjing University of Information Science & Technology, Nanjing Jiangsu 210044, China

Received: 2015-08-29 Revised: 2015-09-17 Online: 2016-02-10 Published: 2016-02-03
Corresponding author: LIU Qingshan (born 1975), male, from Lujiang, Anhui; professor, Ph.D.; research interests: image analysis, pattern recognition.
About the authors: WU Jieqi (born 1992), male, from Yichun, Jiangxi; master's student; research interests: parallel computing, machine learning. LI Xiaoyu (born 1991), male, from Jinzhou, Liaoning; master's student; research interests: big data processing, parallel computing. YUAN Xiaotong (born 1980), male, from Nantong, Jiangsu; professor, Ph.D.; research interests: sparse learning, graphical models, subspace analysis.
Supported by:
National Natural Science Foundation of China (61402232, 61532009, 61522308); Natural Science Foundation of Jiangsu Province (BK20141003, BK2012045).

Abstract

Abstract: As data scale keeps growing, Sparse Subspace Clustering (SSC) faces a substantial computational challenge. Existing SSC solvers, such as the Alternating Direction Method of Multipliers (ADMM), are typically implemented sequentially and therefore cannot exploit multi-core processors to speed up large-scale clustering. To address this issue, a parallel SSC method based on coordinate descent is proposed. The method builds on the observation that SSC can be formulated as a series of per-sample sparse self-expression sub-problems, and it solves each sub-problem by coordinate descent, which requires few parameters and converges quickly. Because the self-expression sub-problems are mutually independent, the sub-problems of different samples are solved simultaneously on different processor cores, which makes full use of the available computing resources and reduces running time. Experiments on simulated data and on the Hopkins-155 motion segmentation dataset, with the commonly used ADMM algorithm as the baseline, show that the proposed algorithm runs significantly faster on multi-core processors while achieving clustering accuracy comparable to that of ADMM.
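The two ideas in the abstract, coordinate descent on each self-expression sub-problem and independent sub-problems solved on separate cores, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the common Lasso form of the sub-problem (each sample is regressed on all other samples with an l1 penalty weighted by `lam`), which the abstract does not spell out. The names `soft_threshold`, `solve_subproblem`, and `self_expression` are hypothetical, and `ProcessPoolExecutor` is only one possible way to use several cores. Pure Python lists are used so the sketch stays self-contained; a practical version would use a vectorized array library.

```python
from concurrent.futures import ProcessPoolExecutor
from functools import partial


def soft_threshold(z, t):
    """Soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def solve_subproblem(i, X, lam, max_sweeps=200, tol=1e-10):
    """Coordinate descent for one self-expression sub-problem.

    Assumed objective (Lasso form, not taken from the paper):
        min_c  0.5 * ||x_i - sum_{j != i} c_j * x_j||^2 + lam * ||c||_1
    X is a list of samples (each a list of floats). Returns the coefficient
    list c, with c[i] kept at 0 so a sample never represents itself.
    """
    n = len(X)
    c = [0.0] * n
    residual = list(X[i])  # x_i - sum_j c_j * x_j, with all c_j starting at 0
    sq_norms = [sum(v * v for v in x) for x in X]
    for _ in range(max_sweeps):
        max_change = 0.0
        for j in range(n):
            if j == i or sq_norms[j] == 0.0:
                continue
            xj = X[j]
            # Correlation of x_j with the residual, with x_j's own
            # current contribution added back in.
            rho = sum(a * r for a, r in zip(xj, residual)) + c[j] * sq_norms[j]
            new_cj = soft_threshold(rho, lam) / sq_norms[j]
            delta = new_cj - c[j]
            if delta != 0.0:
                residual = [r - delta * a for r, a in zip(residual, xj)]
                c[j] = new_cj
                max_change = max(max_change, abs(delta))
        if max_change < tol:  # stop once a full sweep changes nothing
            break
    return c


def self_expression(X, lam, workers=1):
    """Solve every sample's sub-problem; rows of the result are the c vectors.

    With workers > 1 the independent sub-problems are mapped over a pool of
    worker processes; with workers <= 1 they are solved one after another.
    """
    solve = partial(solve_subproblem, X=X, lam=lam)
    if workers <= 1:
        return [solve(i) for i in range(len(X))]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(solve, range(len(X))))
```

A call such as `self_expression(X, lam=0.01, workers=4)` should be made from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. In the standard SSC pipeline, the resulting coefficient matrix is then symmetrized into an affinity matrix and passed to spectral clustering; that step is omitted here.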

Key words: Sparse Subspace Clustering (SSC), high dimensionality, coordinate descent, parallel optimization, motion segmentation

CLC number: TP311

WU Jieqi, LI Xiaoyu, YUAN Xiaotong, LIU Qingshan. Parallel sparse subspace clustering via coordinate descent minimization[J]. Journal of Computer Applications, 2016, 36(2): 372-376.

