《计算机应用》唯一官方网站 ›› 2021, Vol. 41 ›› Issue (12): 3447-3454.DOI: 10.11772/j.issn.1001-9081.2021061129

• 第十八届中国机器学习会议(CCML 2021) • 上一篇    

基于核非负矩阵分解的有向图聚类算法

陈献1,2, 胡丽莹1,2(), 林晓炜1,2, 陈黎飞1,2,3   

  1. 1.福建师范大学 计算机与网络空间安全学院,福州 350117
    2.数字福建环境监测物联网实验室(福建师范大学),福州 350117
    3.福建省应用数学中心(福建师范大学),福州 350117
  • 收稿日期:2021-05-12 修回日期:2021-07-02 接受日期:2021-07-26 发布日期:2021-12-28 出版日期:2021-12-10
  • 通讯作者: 胡丽莹
  • 作者简介:陈献(1997—),男,福建莆田人,硕士研究生,主要研究方向:社区发现、人脸识别
    林晓炜(1992—),男,福建龙海人,硕士研究生,主要研究方向:机器学习、复杂网络
    陈黎飞(1972—),男,福建长乐人,教授,博士,CCF会员,主要研究方向:统计机器学习、数据挖掘、模式识别。
  • 基金资助:
    国家自然科学基金资助项目(U1805263);福建省自然科学基金资助项目(2018J01775)

Directed graph clustering algorithm based on kernel nonnegative matrix factorization

Xian CHEN1,2, Liying HU1,2(), Xiaowei LIN1,2, Lifei CHEN1,2,3   

  1. 1.College of Computer and Cyber Security,Fujian Normal University,Fuzhou Fujian 350117,China
    2.Digit Fujian Internet-of-Things Laboratory of Environmental Monitoring (Fujian Normal University),Fuzhou Fujian 350117,China
    3.Center for Applied Mathematics of Fujian Province (Fujian Normal University),Fuzhou Fujian 350117,China
  • Received:2021-05-12 Revised:2021-07-02 Accepted:2021-07-26 Online:2021-12-28 Published:2021-12-10
  • Contact: Liying HU
  • About author:CHEN Xian, born in 1997, M. S. candidate. His research interests include community detection, face recognition.
    LIN Xiaowei, born in 1992, M. S. candidate. His research interests include machine learning, complex network.
    CHEN Lifei, born in 1972, Ph. D., professor. His research interests include statistical machine learning, data mining, pattern recognition.
  • Supported by:
    the National Natural Science Foundation of China(U1805263);the Natural Science Foundation of Fujian Province(2018J01775)

摘要:

现有的有向图聚类算法大多基于向量空间中节点间的近似线性关系假设,忽略了节点间存在的非线性相关性。针对该问题,提出一种基于核非负矩阵分解(KNMF)的有向图聚类算法。首先,引入核学习方法将有向图的邻接矩阵投影到核空间,并通过特定的正则项约束原空间及核空间中节点间的相似性。其次,提出了图正则化核非对称NMF算法的目标函数,并在非负约束条件下通过梯度下降方法推导出一个聚类算法。该算法在考虑节点连边的方向性的同时利用核学习方法建模节点间的非线性关系,从而准确地揭示有向图中潜在的结构信息。最后,在专利-引文网络(PCN)数据集上的实验结果表明,簇的数目为2时,和对比算法相比,所提算法将DB值和DQF值分别提高了约0.25和8%,取得了更好的聚类质量。

关键词: 有向图聚类, 核非负矩阵分解, 核学习方法, 正则化, 节点相似性

Abstract:

Most of the existing directed graph clustering algorithms are based on the assumption of approximate linear relationship between nodes in vector space, ignoring the existence of non-linear correlation between nodes. To address this problem, a directed graph clustering algorithm based on Kernel Nonnegative Matrix Factorization (KNMF) was proposed. First, the adjacency matrix of a directed graph was projected to the kernel space by using a kernel learning method, and the node similarity in both the original and kernel spaces was constrained by a specific regularization term. Second, the objective function of graph regularization kernel asymmetric NMF algorithm was proposed, and a clustering algorithm was derived by gradient descent method under the non-negative constraints. The algorithm accurately reveals the potential structural information in the directed graph by modeling the non-linear relationship between nodes using kernel learning method, as well as considering the directivity of the links of nodes. Finally, experimental results on the Patent Citation Network (PCN) dataset show that compared with the comparison algorithm, when the number of clusters is 2, the proposed algorithm improves the Davies-Bouldin (DB) and Distance-based Quality Function (DQF) by about 0.25 and 8% respectively, achieving better clustering quality.

Key words: directed graph clustering, Kernel Nonnegative Matrix Factorization (KNMF), kernel learning method, regularization, node similarity

中图分类号: