Journal of Computer Applications, 2019, Vol. 39, Issue (11): 3198-3203. DOI: 10.11772/j.issn.1001-9081.2019051143

• The 2019 CCF Conference on Artificial Intelligence (CCFAI2019)

Cross-social network user alignment algorithm based on knowledge graph embedding

TENG Lei1,2, LI Yuan1,2, LI Zhixing1,2, HU Feng1,2   

  1. College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China;
  2. Chongqing Key Laboratory of Computing Intelligence (Chongqing University of Posts and Telecommunications), Chongqing 400065, China
  • Received: 2019-05-24 Revised: 2019-07-17 Online: 2019-11-10 Published: 2019-09-11
  • Supported by:
    This work is partially supported by the National Key Research and Development Program of China (2017YFB0802305).

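  • Corresponding author: LI Zhixing (李智星)
  • About the authors: TENG Lei (滕磊, born 1995), male, from Chongqing, master's student; main research interests: machine learning and data mining. LI Yuan (李苑, born 1992), female, from Tianjin, master's degree, CCF member; main research interests: network security and deep learning. LI Zhixing (李智星, born 1985), male, from Pingjiang, Hunan, associate professor, Ph.D., CCF member; main research interests: natural language processing, knowledge graphs, data mining and machine learning. HU Feng (胡峰, born 1978), male, from Tianmen, Hubei, professor, Ph.D., CCF member; main research interests: data mining, rough sets and granular computing.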

Abstract: To address the poor network embedding performance of existing cross-social network user alignment algorithms and the lack of any quality guarantee for the negative samples produced by negative sampling, a cross-social network user alignment algorithm based on knowledge graph embedding, KGEUA (Knowledge Graph Embedding User Alignment), was proposed. In the embedding stage, some known seed anchor user pairs were used to expand the positive samples, the Near_K negative sampling method was proposed to generate negative samples, and the two social networks were then embedded into a unified low-dimensional vector space with a knowledge graph embedding method. In the alignment stage, the existing user similarity measure was improved by combining the proposed structural similarity with the traditional cosine similarity, and a greedy matching method based on an adaptive threshold was proposed to align users. The newly aligned user pairs were then added to the training set to continuously optimize the vector space. The experimental results show that the proposed algorithm achieves a hits@30 value of 67.7% on the Twitter-Foursquare dataset, which is 3.3 to 34.8 percentage points higher than the results of the state-of-the-art user alignment algorithms, effectively improving user alignment performance.

Key words: user alignment, social network, network embedding, negative sampling, similarity measure
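
The alignment stage and the hits@k metric mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define the structural similarity, the Near_K sampling, or the adaptive threshold rule, so the sketch uses cosine similarity only and a fixed threshold in place of the adaptive one. The function names, the toy embeddings, and the threshold value are all assumptions made for illustration.

```python
import numpy as np


def cosine_similarity_matrix(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_unit @ b_unit.T


def greedy_match(sim, threshold):
    """Greedy one-to-one matching over a similarity matrix.

    Pairs are visited from most to least similar; a pair is accepted if
    neither user is already matched. Matching stops at the first score
    below the threshold (fixed here; the paper uses an adaptive one).
    """
    order = np.argsort(sim, axis=None)[::-1]  # flat indices, descending score
    used_rows, used_cols, pairs = set(), set(), []
    for flat in order:
        i, j = (int(x) for x in np.unravel_index(flat, sim.shape))
        if sim[i, j] < threshold:
            break
        if i in used_rows or j in used_cols:
            continue
        used_rows.add(i)
        used_cols.add(j)
        pairs.append((i, j))
    return pairs


def hits_at_k(sim, true_cols, k):
    """Fraction of rows whose true counterpart is among the k most similar columns."""
    top_k = np.argsort(-sim, axis=1)[:, :k]
    hits = [true_cols[i] in top_k[i] for i in range(sim.shape[0])]
    return float(np.mean(hits))


# Toy embeddings (hypothetical): user i in network A corresponds to user i in network B.
emb_a = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
emb_b = np.array([[0.9, 0.1], [0.1, 0.9], [-1.0, 0.2]])

sim = cosine_similarity_matrix(emb_a, emb_b)
pairs = greedy_match(sim, threshold=0.5)
score = hits_at_k(sim, true_cols=[0, 1, 2], k=1)
```

In this toy setup the third user has no sufficiently similar counterpart, so the threshold leaves that user unmatched instead of forcing a low-confidence pair. Under the scheme the abstract describes, accepted pairs would then be added back to the training set before re-embedding.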

