Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (11): 3457-3463. DOI: 10.11772/j.issn.1001-9081.2022111736

• Data Science and Technology •

Nonuniform time slicing method based on estimation of community variance

Xiangyu LUO1, Ke YAN1, Yan LU1, Tian WANG1, Gang XIN2

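  1. College of Computer Science and Technology, Xi’an University of Science and Technology, Xi’an 710054
  2. AVIC Xi’an Aeronautics Computing Technique Research Institute, Xi’an 710119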
  • Received: 2022-11-22 Revised: 2023-02-27 Accepted: 2023-03-08 Online: 2023-03-20 Published: 2023-11-10
  • Corresponding author: Xiangyu LUO
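  • About the authors: LUO Xiangyu (born 1984), female, from Xingtai, Hebei, associate professor, Ph.D., CCF member. Main research interests: graph computing, complex networks. luoxiangyu@xust.edu.cn
    YAN Ke (born 1994), male, from Nanyang, Henan, master's student. Main research interests: community evolution analysis.
    LU Yan (born 1998), female, from Zhumadian, Henan, master's student. Main research interests: community evolution analysis, spreading dynamics analysis.
    WANG Tian (born 1999), female, from Baoji, Shaanxi, master's student. Main research interests: community evolution analysis, spreading dynamics analysis.
    XIN Gang (born 1984), male, from Baoji, Shaanxi, senior engineer, M.S. Main research interests: machine learning, big data.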
  • Supported by:
    National Natural Science Foundation of China (12071367); General Project of the Shaanxi Provincial Basic Research Program (2022JM-317)

Nonuniform time slicing method based on prediction of community variance

Xiangyu LUO1, Ke YAN1, Yan LU1, Tian WANG1, Gang XIN2

  1. College of Computer Science and Technology, Xi’an University of Science and Technology, Xi’an Shaanxi 710054, China
  2. AVIC Xi’an Aeronautics Computing Technique Research Institute, Xi’an Shaanxi 710119, China
  • Received: 2022-11-22 Revised: 2023-02-27 Accepted: 2023-03-08 Online: 2023-03-20 Published: 2023-11-10
  • Contact: Xiangyu LUO
  • About the authors: LUO Xiangyu, born in 1984, Ph.D., associate professor. Her research interests include graph computing and complex networks.
    YAN Ke, born in 1994, M.S. candidate. His research interests include community evolution analysis.
    LU Yan, born in 1998, M.S. candidate. Her research interests include community evolution analysis and analysis of spreading dynamics.
    WANG Tian, born in 1999, M.S. candidate. Her research interests include community evolution analysis and analysis of spreading dynamics.
    XIN Gang, born in 1984, M.S., senior engineer. His research interests include machine learning and big data.
  • Supported by:
    National Natural Science Foundation of China (12071367); Program of Basic Natural Science of Shaanxi Province (2022JM-317)

Abstract:

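The time slicing method used for a dynamic network strongly affects the accuracy of community evolution analysis. However, communities change nonlinearly with time and with changes in network topology, so both existing uniform time slicing and nonuniform time slicing based on the amount of network topology change perform poorly at capturing community evolution events. To address this, a nonuniform time slicing method based on estimation of community variance is proposed, where community variance is quantified as the difference between the community modularity that the changed network is expected to reach and the community modularity obtained by directly applying the community detection result from before the change. First, a community modularity prediction model is built using time series analysis. Second, the model is used to predict the community modularity that the changed network is expected to reach, from which an estimate of community variance is obtained. Finally, a new time slice is generated whenever this estimate exceeds a preset threshold. Experimental results on two real network datasets show that, compared with the traditional uniform time slicing method and the nonuniform time slicing method based on network topology change, on the dynamic network dataset Arxiv HEP-PH the proposed method identifies community disappearance events 1.10 d and 1.30 d earlier, respectively, identifies community formation events 8.34 d and 3.34 d earlier, respectively, and identifies 10 and 1 more community shrinking and growing events in total, respectively; on the Sx-MathOverflow dataset it identifies community disappearance events 3.30 d and 1.80 d earlier, respectively, community formation events 6.41 d and 2.97 d earlier, respectively, and 15 and 7 more community shrinking and growing events in total, respectively.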
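
The quantity at the center of the method can be written compactly as follows. This is a sketch under assumptions: the abstract does not give notation, so the symbols ($G$ for the network before the change, $G'$ for the changed network, $C_G$ for the communities detected on $G$, $\widehat{Q}$ for the time-series prediction, $\theta$ for the threshold) are introduced here for illustration, and the modularity is assumed to be the standard Newman modularity.

```latex
% Notation assumed for illustration; it is not taken from the paper.
% Standard Newman modularity of a partition C on a graph with adjacency
% matrix A, node degrees k_i, m edges, and community labels c_i:
Q(G, C) = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)

% Estimated community variance: predicted modularity of the changed network
% minus the modularity of the pre-change partition applied to it.
\widehat{\Delta}(G') = \widehat{Q}(G') - Q(G', C_G)

% Slicing rule: a new time slice starts once the estimate exceeds the threshold.
\widehat{\Delta}(G') > \theta \;\Longrightarrow\; \text{generate a new time slice}
```

Under this reading, a large gap means the old partition explains the changed network poorly relative to what a fresh community detection is expected to achieve.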

Key words: dynamic network, time slicing, community evolution, time series analysis, community detection, community modularity

Abstract:

Time slicing methods for dynamic networks strongly influence the accuracy of community evolution analysis. Because communities vary nonlinearly with time and with changes in network topology, both the existing uniform time slicing method and the nonuniform time slicing method based on network topology variance are unsatisfactory at capturing community evolution events. Therefore, a nonuniform time slicing method based on prediction of community variance was proposed, where the community variance is quantified as the difference between the community modularity that the updated network is expected to achieve and the community modularity obtained by directly applying the community detection results of the network before the change. Firstly, a prediction model of community modularity was established on the basis of time series analysis. Secondly, the established model was used to predict the expected community modularity of the updated network, from which the predicted value of community variance was obtained. Finally, once the predicted value exceeded a preset threshold, a new time slice was generated. Experimental results on two real network datasets show that, compared with the traditional uniform time slicing method and the nonuniform time slicing method based on network topology variance, on the dynamic network dataset Arxiv HEP-PH the proposed method identifies community disappearance events 1.10 days and 1.30 days earlier, respectively, identifies community forming events 8.34 days and 3.34 days earlier, respectively, and increases the total number of identified community shrinking and growing events by 10 and 1, respectively. On the Sx-MathOverflow dataset, the proposed method identifies community disappearance events 3.30 days and 1.80 days earlier, respectively, identifies community forming events 6.41 days and 2.97 days earlier, respectively, and increases the total number of identified community shrinking and growing events by 15 and 7, respectively.
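
The three steps in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. All function names are hypothetical, the modularity is assumed to be the standard Newman modularity, the time series model is replaced by a simple drift forecast (last value plus the average past step), and nodes that appear after the last community detection are treated as singleton communities. The modularity history in the example is made up.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity of an undirected, unweighted edge list.

    `community_of` maps node -> community label. Nodes absent from the mapping
    (for example, nodes that joined after the last community detection) are
    treated as singleton communities; that handling is an assumption here.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    degree_sum = defaultdict(int)  # sum of node degrees per community
    internal = defaultdict(int)    # edges with both endpoints in the community
    for u, v in edges:
        cu = community_of.get(u, ("singleton", u))
        cv = community_of.get(v, ("singleton", v))
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


def drift_forecast(history):
    """Stand-in forecaster (not the paper's model): last value plus mean step."""
    if not history:
        raise ValueError("need at least one past modularity value")
    if len(history) == 1:
        return history[-1]
    return history[-1] + (history[-1] - history[0]) / (len(history) - 1)


def estimated_community_variance(history, edges, old_community_of):
    """Predicted modularity of the updated network minus the modularity that
    the pre-change partition achieves on the updated network."""
    return drift_forecast(history) - modularity(edges, old_community_of)


def should_start_new_slice(history, edges, old_community_of, threshold):
    """True when the estimated community variance exceeds the preset threshold."""
    return estimated_community_variance(history, edges, old_community_of) > threshold


# Two triangles joined by a bridge, then two more cross-community edges arrive.
old_partition = {1: "a", 2: "a", 3: "a", 4: "b", 5: "b", 6: "b"}
old_edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
new_edges = old_edges + [(3, 5), (2, 6)]
past_q = [0.30, 0.32, 0.34]  # made-up modularity history, for illustration only

cut = should_start_new_slice(past_q, new_edges, old_partition, threshold=0.1)
```

In this toy case the new cross-community edges lower the modularity of the old partition while the forecast keeps rising, so the estimate exceeds the threshold and a slice boundary would be placed. In a full pipeline, community detection would then be rerun and its modularity appended to the history.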

Key words: dynamic network, time slicing, community evolution, time series analysis, community detection, community modularity

CLC number: