Journal of Computer Applications, 2020, Vol. 40, Issue (2): 441-447. DOI: 10.11772/j.issn.1001-9081.2019081529

• CCF NDBC 2019 •

Representation learning for topic-attention network

Jingfeng GUO1,2, Hui DONG1,2, Tingwei ZHANG1,2, Xiao CHEN3

  1. College of Information Science and Engineering, Yanshan University, Qinhuangdao Hebei 066004, China
  2. Key Laboratory for Computer Virtual Technology and System Integration of Hebei Province, Qinhuangdao Hebei 066004, China
  3. Network Technology Center, Hebei Normal University of Science and Technology, Qinhuangdao Hebei 066004, China
  • Received: 2019-08-12 Revised: 2019-09-10 Accepted: 2019-10-24 Online: 2019-11-04 Published: 2020-02-10
  • Contact: Hui DONG
  • About the authors: GUO Jingfeng, born in 1962, Ph. D., professor. His research interests include databases, data mining, and social network analysis.
    ZHANG Tingwei, born in 1995, M. S. candidate. His research interests include network representation learning and social network analysis.
    CHEN Xiao, born in 1983, Ph. D., research assistant. Her research interests include graph mining, network representation learning, and social network analysis.
  • Supported by:
    the National Natural Science Foundation of China (61472340); the Youth Science Foundation of Hebei Province (F2017209070); the Doctoral Research Start-up Fund (Natural Science) of Hebei Normal University of Science and Technology (2019YB011); the Natural Science Foundation of Hebei (F2019203157); the Key Projects of Science and Technology Research in Colleges and Universities of Hebei Province (ZD2019004)

主题关注网络的表示学习

郭景峰1,2, 董慧1,2, 张庭玮1,2, 陈晓3

  1. 燕山大学 信息科学与工程学院,河北 秦皇岛 066004
  2. 河北省计算机虚拟技术与系统集成重点实验室,河北 秦皇岛 066004
  3. 河北科技师范学院 网络技术中心,河北 秦皇岛 066004
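  • Corresponding author: 董慧 (Hui DONG)
  • About the authors: 郭景峰 (GUO Jingfeng), born in 1962, male, native of Harbin, Heilongjiang, professor, Ph. D., CCF member. His research interests include databases, data mining, and social network analysis.
    张庭玮 (ZHANG Tingwei), born in 1995, male, native of Qinhuangdao, Hebei, M. S. candidate. His research interests include network representation learning and social network analysis.
    陈晓 (CHEN Xiao), born in 1983, female, native of Qinhuangdao, Hebei, research assistant, Ph. D., CCF member. Her research interests include graph mining, network representation learning, and social network analysis.
  • Supported by:
    the National Natural Science Foundation of China (61472340); the Youth Science Foundation of Hebei Province (F2017209070); the Doctoral Research Start-up Fund (Natural Science) of Hebei Normal University of Science and Technology (2019YB011); the Natural Science Foundation of Hebei Province (F2019203157); the Key Projects of Science and Technology Research in Colleges and Universities of Hebei Province (ZD2019004)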

Abstract:

To address the problem that heterogeneous network representation learning considers only the structure of social relations and ignores semantics, a representation learning algorithm based on the topic-attention network is proposed, combining the social relationships between users with users' preferences for topics. First, based on the characteristics of the topic-attention network and the identical-discrepancy-contrary idea (certainty and uncertainty) of set pair analysis theory, a transition probability model is given. Then, a random walk algorithm over two types of nodes is built on the transition probability model to obtain relatively high-quality random walk sequences. Finally, the embedding vector space representation of the topic-attention network is obtained by modeling the two types of nodes in the sequences. Theoretical analysis and experimental results on the Douban dataset show that the random walk algorithm combined with the transition probability model analyzes the connections between nodes in the network more comprehensively. When the number of communities is 13, the modularity of the proposed algorithm is 0.6998, nearly 5% higher than that of the metapath2vec algorithm, and the algorithm captures more detailed information in the network.

Key words: topic-attention network, set pair analysis, transition probability, random walk, representation learning
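
The random walk over two types of nodes described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the transition probability model derived from set pair analysis, so the function `transition_weights` and its `topic_bias` parameter below are hypothetical placeholders, and the helper names `build_network` and `random_walk` are likewise assumptions made for illustration rather than the authors' implementation.

```python
import random
from collections import defaultdict


def build_network(follows, interests):
    """Build adjacency lists for a network with user nodes and topic nodes."""
    adj = defaultdict(list)
    for u, v in follows:  # user-user social links, treated as undirected here
        adj[("user", u)].append(("user", v))
        adj[("user", v)].append(("user", u))
    for u, t in interests:  # user-topic attention links
        adj[("user", u)].append(("topic", t))
        adj[("topic", t)].append(("user", u))
    return adj


def transition_weights(neighbors, topic_bias):
    """Placeholder weights (an assumption, not the paper's model):
    topic neighbours get weight topic_bias, user neighbours get 1.0."""
    return [topic_bias if kind == "topic" else 1.0 for kind, _ in neighbors]


def random_walk(adj, start, length, topic_bias=1.0, rng=None):
    """Generate one weighted random walk of at most `length` nodes."""
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        neighbors = adj.get(walk[-1], [])
        if not neighbors:  # dead end: stop the walk early
            break
        weights = transition_weights(neighbors, topic_bias)
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk
```

Sequences produced this way mix user and topic nodes and could then be passed to a skip-gram style embedding model; replacing `transition_weights` with the paper's transition probability model would be the step that matters for walk quality.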

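Abstract (Chinese version, translated):

Heterogeneous network representation learning considers social relations only from the structural side and ignores semantics. To address this problem, a representation learning algorithm based on the topic-attention network is proposed that combines two aspects: the social relationships between users and users' preferences for topics. First, in view of the characteristics of the topic-attention network and drawing on the identical-discrepancy-contrary (certainty and uncertainty) idea of set pair analysis theory, a transition probability model is given. Then, on the basis of the transition probability model, a random walk algorithm over two types of nodes is proposed to obtain relatively high-quality random walk sequences. Finally, the embedding vector space representation of the topic-attention network is obtained by modeling the two types of nodes in the sequences. Theoretical analysis and experimental results on the Douban dataset show that the random walk algorithm combined with the transition probability model analyzes the connections between nodes in the network more comprehensively. When the number of communities is 13, the modularity of the proposed algorithm is 0.6998, nearly 5% higher than that of the metapath2vec algorithm, and the algorithm captures the information in the network in more detail.

Key words (Chinese version, translated): topic-attention network, set pair analysis, transition probability, random walk, representation learning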
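
The abstract reports a modularity score for a partition into 13 communities. As a rough illustration of how such a score is computed, the sketch below assumes the standard Newman modularity for an undirected, unweighted graph; the abstract does not state which definition or clustering step the authors used, and the function name `modularity` and its arguments are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q = sum over communities c of
    (internal_edges_c / m) - (degree_sum_c / (2 * m)) ** 2,
    where m is the number of edges and `community` maps node -> label."""
    m = len(edges)
    if m == 0:
        return 0.0
    internal = defaultdict(int)  # edges with both endpoints in the community
    degree = defaultdict(int)    # summed node degree per community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)
```

For example, two triangles joined by a single edge and split into their two natural communities give Q = 5/14, while placing every node in one community gives Q = 0.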

CLC Number: