Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (8): 2338-2344. DOI: 10.11772/j.issn.1001-9081.2022091337

• The 19th CCF China Conference on Information Systems and Applications

Attribute network representation learning with dual auto-encoder

WANG Jinghong 1,2,3, ZHOU Zhixia 1, WANG Hui 4, LI Haokang 4

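  1. College of Computer and Cyber Security, Hebei Normal University, Shijiazhuang 050024
  2. Hebei Provincial Key Laboratory of Network and Information Security (Hebei Normal University), Shijiazhuang 050024
  3. Hebei Provincial Engineering Research Center for Supply Chain Big Data Analytics and Data Security (Hebei Normal University), Shijiazhuang 050024
  4. Hebei Polytechnic Institute, Shijiazhuang 050091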
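  • Received: 2022-09-06 Revised: 2022-09-27 Accepted: 2022-10-08 Online: 2022-10-13 Published: 2023-08-10
  • Corresponding author: WANG Hui
  • About the authors: WANG Jinghong (born 1967), female, from Shijiazhuang, Hebei, professor, Ph.D., CCF member. Her research interests include artificial intelligence, big data, and data mining.
    ZHOU Zhixia (born 1996), female, from Shijiazhuang, Hebei, M.S. candidate, CCF member. Her research interests include data mining and network representation learning.
    LI Haokang (born 1994), male, from Shijiazhuang, Hebei, M.S., CCF member. His research interests include community discovery, deep learning, and graph representation learning.
  • Funding:
    Central Guidance on Local Science and Technology Development Fund (226Z1808G); Hebei Natural Science Foundation (F2021205014); Science and Technology Research Project of Hebei Colleges and Universities (ZD2022139); Hebei Normal University Key Project (L2023J05)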

Attribute network representation learning with dual auto-encoder

Jinghong WANG 1,2,3, Zhixia ZHOU 1, Hui WANG 4, Haokang LI 4

  1. College of Computer and Cyber Security, Hebei Normal University, Shijiazhuang Hebei 050024, China
  2. Hebei Provincial Key Laboratory of Network and Information Security (Hebei Normal University), Shijiazhuang Hebei 050024, China
  3. Hebei Provincial Engineering Research Center for Supply Chain Big Data Analytics and Security (Hebei Normal University), Shijiazhuang Hebei 050024, China
  4. Hebei Polytechnic Institute, Shijiazhuang Hebei 050091, China
  • Received: 2022-09-06 Revised: 2022-09-27 Accepted: 2022-10-08 Online: 2022-10-13 Published: 2023-08-10
  • Contact: Hui WANG
  • About the authors: WANG Jinghong, born in 1967, Ph.D., professor. Her research interests include artificial intelligence, big data, and data mining.
    ZHOU Zhixia, born in 1996, M.S. candidate. Her research interests include data mining and network representation learning.
    LI Haokang, born in 1994, M.S. His research interests include community discovery, deep learning, and graph representation learning.
  • Supported by:
    Central Guidance on Local Science and Technology Development Fund of Hebei Province (226Z1808G); Hebei Natural Science Foundation (F2021205014); Science and Technology Project of Hebei Colleges and Universities (ZD2022139); Hebei Normal University Science and Technology Major Project (L2023J05)

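Abstract (translated from the Chinese):

The goal of attribute network representation learning is to learn low-dimensional dense vector representations of nodes by combining structure and attribute information, while preserving the properties of the nodes in the network. Current attribute network representation learning methods neglect the learning of attribute information in the network, and the interaction between attribute information and network topology in these methods is insufficient, so network structure and attribute information cannot be fused efficiently. To address these problems, a Dual auto-Encoder Network Representation Learning (DENRL) algorithm is proposed. First, the high-order neighborhood information of nodes is captured through a multi-hop attention mechanism. Second, a low-pass Laplacian filter is designed to remove high-frequency signals and to iteratively obtain the attribute information of important neighbor nodes. Finally, an adaptive fusion module is constructed that increases the acquisition of important information through consistency and difference constraints on the structure and attribute information, and the encoders are trained by supervising the joint reconstruction loss function of the two auto-encoders. Experimental results on the Cora, Citeseer, Pubmed and Wiki datasets show that, compared with DeepWalk, ANRL (Attributed Network Representation Learning) and other algorithms, DENRL has the highest clustering accuracy and the shortest running time on the three citation network datasets, with a clustering accuracy of 0.775 and a running time of 0.4602 s on the Cora dataset. DENRL also has the highest link prediction precision on the Cora and Citeseer datasets, reaching 0.961 and 0.970 respectively. These results indicate that fusing attribute and structure information and learning them interactively yields stronger node representations.

Key words: attribute network, network representation learning, auto-encoder, interactive learning, attention mechanism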

Abstract:

Attribute network representation learning aims to learn low-dimensional dense vector representations of nodes by combining structure and attribute information while preserving the properties of the nodes in the network. Existing attribute network representation learning methods neglect the learning of attribute information, and the interaction between attribute information and network topology is insufficient, so network structure and attribute information cannot be fused efficiently. To address these problems, a Dual auto-Encoder Network Representation Learning (DENRL) algorithm was proposed. Firstly, the high-order neighborhood information of nodes was captured through a multi-hop attention mechanism. Secondly, a low-pass Laplacian filter was designed to remove high-frequency signals and iteratively obtain the attribute information of important neighbor nodes. Finally, an adaptive fusion module was constructed to increase the acquisition of important information through consistency and difference constraints on the two kinds of information, and the encoders were trained by supervising the joint reconstruction loss function of the two auto-encoders. Experimental results on the Cora, Citeseer, Pubmed and Wiki datasets show that, compared with DeepWalk, ANRL (Attributed Network Representation Learning) and other algorithms, DENRL achieves the highest clustering accuracy and the shortest running time on the three citation network datasets, with a clustering accuracy of 0.775 and a running time of 0.4602 s on the Cora dataset. DENRL also achieves the highest link prediction precision on the Cora and Citeseer datasets, reaching 0.961 and 0.970 respectively. These results indicate that fusing attribute and structure information and learning them interactively yields stronger node representations.

Key words: attribute network, network representation learning, auto-encoder, interactive learning, attention mechanism
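
The low-pass Laplacian filtering step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the filter's exact form, so everything below is an assumption for illustration only: the filter is taken to be H = I - k·L, where L is the symmetrically normalized Laplacian of the graph with self-loops, applied t times to the attribute matrix. The function name `low_pass_filter` and the parameters `k` and `t` are hypothetical and are not taken from the paper.

```python
import numpy as np


def low_pass_filter(adj, features, k=2 / 3, t=2):
    """Smooth node attributes with an assumed filter H = I - k * L, applied t times.

    adj:      (n, n) symmetric adjacency matrix without self-loops
    features: (n, d) node attribute matrix
    k, t:     hypothetical filter strength and number of passes
    """
    n = adj.shape[0]
    a_hat = adj + np.eye(n)                      # add self-loops
    deg = a_hat.sum(axis=1)                      # node degrees (>= 1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Symmetric normalization: D^{-1/2} (A + I) D^{-1/2}
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    lap = np.eye(n) - a_norm                     # normalized Laplacian L
    h = np.eye(n) - k * lap                      # assumed low-pass filter H
    out = features.astype(float)
    for _ in range(t):                           # iterate the filter t times
        out = h @ out
    return out


# Two nodes joined by one edge. Column 0 alternates in sign across the edge
# (a high-frequency signal); column 1 is constant (a low-frequency signal).
adj = np.array([[0, 1], [1, 0]])
x = np.array([[1.0, 1.0], [-1.0, 1.0]])
smoothed = low_pass_filter(adj, x, t=2)
# On this graph each pass scales the alternating column by 1/3 and leaves
# the constant column unchanged.
```

On this toy graph the Laplacian has eigenvalues 0 and 1, so with k = 2/3 the filter keeps the constant signal and damps the alternating one, which is the sense in which such a filter removes high-frequency components while retaining smooth neighborhood information.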

CLC number: