Journal of Computer Applications ›› 2017, Vol. 37 ›› Issue (12): 3435-3441. DOI: 10.11772/j.issn.1001-9081.2017.12.3435

• Cyberspace Security •

User identification method across social networks based on weighted hypergraph

XU Qian, CHEN Hongchang, WU Zheng, HUANG Ruiyang

  1. National Digital Switching System Engineering & Technological Research Center, Zhengzhou Henan 450002, China
  • Received: 2017-05-23 Revised: 2017-08-10 Online: 2017-12-10 Published: 2017-12-18
  • Corresponding author: XU Qian
  • About the authors: XU Qian (born 1993), male, from Dalian, Liaoning, M.S. candidate; main research interests: social network mining, machine learning. CHEN Hongchang (born 1964), male, from Zhengzhou, Henan, professor, Ph.D.; main research interests: information protection in telecommunication networks, information and communication security. WU Zheng (born 1992), male, from Xuzhou, Jiangsu, M.S. candidate; main research interests: complex networks, network big data analysis and processing. HUANG Ruiyang (born 1986), male, from Zhangzhou, Fujian, associate research fellow, Ph.D.; main research interests: network big data analysis and processing, distributed big data processing.
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61521003).

Abstract: As social networks proliferate, more and more researchers analyze social network data from a multi-source perspective, and fusing data from multiple social networks depends on identifying the same user across networks. To address the poor use that the existing Friend Relationship-based User Identification (FRUI) algorithm makes of heterogeneous relations within a social network, a Weighted Hypergraph based User Identification (WHUI) algorithm across social networks is proposed. First, a weighted hypergraph is constructed on the friend relation network to describe both the friend relations and the heterogeneous relations within the same network, which represents the topological environment of each node more accurately. Then, on the constructed weighted hypergraph, a cross-network similarity between nodes is defined, based on the property that a node's topological environment is roughly the same in different networks. Finally, an iterative matching algorithm selects the user pair with the highest cross-network similarity in each round, with two-way authentication and result pruning added to safeguard identification accuracy. Experiments were carried out on the DBLP collaboration network and on real social networks. On the real social networks, compared with the FRUI algorithm, the proposed algorithm improves average precision by 5.5 percentage points, average recall by 3.4 percentage points, and average F-measure by 4.6 percentage points. Using only network topology information, the WHUI algorithm effectively improves the precision and recall of user identification in practical applications.

Key words: user identification across social networks, weighted hypergraph, heterogeneous relation, node similarity, iterative matching
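
The matching stage outlined in the abstract (pick the highest-scoring user pair each round, confirm it in both directions, prune weak results) can be illustrated with a minimal Python sketch. It is not the WHUI implementation: the abstract does not give the hypergraph construction or the similarity formula, so `similarity` below is a placeholder that simply sums, over already-matched neighbours, the smaller of the two edge weights. The function names, the `min_score` threshold, and the dict-of-dicts graph layout are all assumptions made for illustration.

```python
from itertools import product


def similarity(u, v, g1, g2, matched):
    """Placeholder score (an assumption, not the paper's formula):
    for each neighbour a of u that is already matched to a neighbour of v,
    add the smaller of the two edge weights."""
    return sum(
        min(w, g2[v][matched[a]])
        for a, w in g1[u].items()
        if a in matched and matched[a] in g2[v]
    )


def iterative_match(g1, g2, seeds, min_score=1.0):
    """Greedy iterative matching between two weighted graphs.

    g1, g2: dict mapping node -> {neighbour: edge weight}
    seeds:  dict of known pairs, node in g1 -> node in g2
    """
    matched = dict(seeds)
    while True:
        used = set(matched.values())
        free1 = [u for u in g1 if u not in matched]
        free2 = [v for v in g2 if v not in used]
        scores = {
            (u, v): similarity(u, v, g1, g2, matched)
            for u, v in product(free1, free2)
        }
        # Pruning (hypothetical rule): drop pairs below the threshold.
        scores = {p: s for p, s in scores.items() if s >= min_score}

        accepted = None
        for (u, v), s in sorted(scores.items(), key=lambda kv: -kv[1]):
            # Two-way check: no other candidate involving u or v may
            # score as high, so each is the other's unique best match.
            rivals = [
                t for (x, y), t in scores.items()
                if (x == u or y == v) and (x, y) != (u, v)
            ]
            if all(t < s for t in rivals):
                accepted = (u, v)
                break
        if accepted is None:
            return matched          # nothing unambiguous left to add
        matched[accepted[0]] = accepted[1]


# Two copies of the same small network; a and b are the known seed users.
g1 = {
    "a1": {"b1": 1.0, "c1": 1.0},
    "b1": {"a1": 1.0, "c1": 2.0, "d1": 1.0},
    "c1": {"a1": 1.0, "b1": 2.0},
    "d1": {"b1": 1.0},
}
g2 = {
    "a2": {"b2": 1.0, "c2": 1.0},
    "b2": {"a2": 1.0, "c2": 2.0, "d2": 1.0},
    "c2": {"a2": 1.0, "b2": 2.0},
    "d2": {"b2": 1.0},
}
result = iterative_match(g1, g2, {"a1": "a2", "b1": "b2"})
```

In this toy run the pair (`c1`, `c2`) is accepted first because it shares two matched neighbours, and (`d1`, `d2`) follows once it is the only candidate left. When two unmatched users are structurally indistinguishable, the two-way check rejects both candidates and the loop stops rather than guessing.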

CLC Number: