Network Embedding based on Node Attribute Bipartite Graph

Le ZHOU¹, Tingting DAI², Chun LI³, Jun XIE³, Boce CHU⁴, Feng LI⁴, Junyi ZHANG³, Qiao LIU⁵

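1. School of Information and Software Engineering, University of Electronic Science and Technology of China
2. No. 4, Section 2, Jianshe North Road, Chenghua District, Chengdu, Sichuan Province
3. Hebei Key Laboratory of Electromagnetic Spectrum Cognition and Control
4. Key Laboratory of Aerospace Information Applications, China Electronics Technology Group Corporation
5. University of Electronic Science and Technology of China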

Received: 2021-06-07 Revised: 2021-08-17 Published: 2021-08-17
Corresponding author: Le ZHOU

Abstract: Carrying out inference on graph-structured data is an important and common task. Its main challenge is how to represent network knowledge so that machines can easily understand and use graph-structured data. A comparison of existing representation learning models shows that models based on random walk methods tend to overlook the special effect of attributes on the association between nodes. Therefore, a hybrid random walk method based on node adjacency and attribute association is proposed. The basic idea is to first calculate attribute weights from the distribution of common attributes among adjacent nodes and obtain the sampling probability from a node to each attribute. Network information is then extracted from both adjacent nodes and non-adjacent nodes that share common attributes. Finally, a network representation learning model based on the node-attribute bipartite graph is constructed to learn node embeddings from the resulting sampling sequences. Experimental results on the public datasets Flickr, BlogCatalog and Cora show that the average accuracy of node label classification using the node embeddings learned by the model is 89.07%, which is 2.13 percentage points higher than recent work and 21.34 percentage points higher than classical work. A comparison of different random walk methods also shows that increasing the sampling probability of attributes that promote node association increases the information contained in the sampling sequences.
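
The sampling idea described in the abstract can be illustrated with a small Python sketch. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: the names `attribute_weights`, `attribute_probs` and `hybrid_walk` are hypothetical, an attribute's weight is taken to be the number of edges whose two endpoints share it (plus a smoothing constant), and the parameter `alpha` is an assumed probability of taking an ordinary adjacency step rather than an attribute step.

```python
import random
from collections import defaultdict


def attribute_weights(adj, attrs, smoothing=1.0):
    """Assumed weighting: count the edges whose endpoints share an attribute."""
    counts = defaultdict(float)
    for u, neighbours in adj.items():
        for v in neighbours:
            if u < v:  # count each undirected edge once
                for a in attrs[u] & attrs[v]:
                    counts[a] += 1.0
    all_attrs = sorted(set().union(*attrs.values()))
    return {a: counts[a] + smoothing for a in all_attrs}


def attribute_probs(node, attrs, weights):
    """Normalise the weights of a node's own attributes into probabilities."""
    own = sorted(attrs[node])
    total = sum(weights[a] for a in own)
    return {a: weights[a] / total for a in own}


def hybrid_walk(start, adj, attrs, weights, length, alpha=0.5, rng=None):
    """Mix adjacency steps with node -> attribute -> node steps."""
    rng = rng or random.Random()
    members = defaultdict(set)  # attribute side of the bipartite graph
    for node, node_attrs in attrs.items():
        for a in node_attrs:
            members[a].add(node)
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbr_set = adj.get(cur, set())
        probs = attribute_probs(cur, attrs, weights)
        candidates = []
        # Attribute step: taken with probability 1 - alpha, or always
        # when the current node has no neighbours.
        if probs and (not nbr_set or rng.random() >= alpha):
            a = rng.choices(list(probs), weights=list(probs.values()))[0]
            # Non-adjacent nodes that share the sampled attribute.
            candidates = sorted(members[a] - nbr_set - {cur})
        if not candidates:  # adjacency step, also used as a fallback
            candidates = sorted(nbr_set)
        if not candidates:
            break  # isolated node with no shared attributes
        walk.append(rng.choice(candidates))
    return walk


# Toy network: node 3 has no edges but shares attribute "x" with nodes 0 and 1.
adj = {0: {1}, 1: {0, 2}, 2: {1}, 3: set()}
attrs = {0: {"x", "y"}, 1: {"x"}, 2: {"y"}, 3: {"x"}}
weights = attribute_weights(adj, attrs)
walk = hybrid_walk(0, adj, attrs, weights, length=5, rng=random.Random(0))
print(weights)
print(walk)
```

In this toy network only the edge between nodes 0 and 1 has endpoints that share an attribute ("x"), so "x" receives a higher weight than "y" and is sampled more often from node 0. Sequences produced this way could then be passed to a skip-gram style model to learn node embeddings; that step is omitted here.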

Keywords: network embedding, representation learning, random walk, network sampling, attributed network, node classification

CLC number: TP391, TP18

Le ZHOU, Tingting DAI, Chun LI, Jun XIE, Boce CHU, Feng LI, Junyi ZHANG, Qiao LIU. Network Embedding based on Node Attribute Bipartite Graph[J]. Journal of Computer Applications, 2022, 42(8): 2311-2318.