Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (10): 3260-3266. DOI: 10.11772/j.issn.1001-9081.2023101557
• The 40th CCF National Database Conference of China (NDBC 2023) •

Information diffusion prediction model of prototype-aware dual-channel graph convolutional neural network
Received: 2023-11-13
Revised: 2023-12-28
Accepted: 2024-01-02
Online: 2024-10-15
Published: 2024-10-10
Contact: Xiaofei ZHU
About author: XIANG Nengqiang, born in 1998, M.S. candidate, CCF member. His research interests include natural language processing and social networks.
Supported by:
Nengqiang XIANG, Xiaofei ZHU, Zhaoze GAO
Abstract: Aiming at the problem that existing information diffusion prediction models struggle to capture users' dependencies on cascades, an information diffusion prediction model based on a Prototype-aware Dual-channel Graph Convolutional neural Network (PDGCN) was proposed. First, a HyperGraph Convolutional Network (HGCN) was used to learn user representations and cascade representations at the cascade-hypergraph level, while a Graph Convolutional Network (GCN) was used to learn user representations on the dynamic friendship forwarding graph. Second, for a given target cascade, the user representations matching the current cascade were looked up at each of the above two levels and fused. Third, prototypes of the cascade representations were obtained by a clustering algorithm. Finally, the prototype best matching the current cascade was retrieved and fused into each user representation of the cascade, so as to compute the diffusion probabilities of candidate users. Compared with the Memory-enhanced Sequential HyperGraph ATtention network (MS-HGAT), PDGCN improves Hits@100 by 1.17% and MAP@100 by 5.02% on the Twitter dataset, and improves Hits@100 by 3.88% and MAP@100 by 0.72% on the Android dataset. Experimental results show that the proposed model outperforms the comparison models on the information diffusion prediction task and has better prediction performance.
CLC number:
Nengqiang XIANG, Xiaofei ZHU, Zhaoze GAO. Information diffusion prediction model of prototype-aware dual-channel graph convolutional neural network[J]. Journal of Computer Applications, 2024, 44(10): 3260-3266.
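The pipeline summarized in the abstract — two channels of user representations, fusion, prototype clustering of cascade representations, and prototype-aware scoring of candidate users — can be sketched as below. This is a minimal NumPy illustration under loud assumptions: the embeddings are random placeholders standing in for the HGCN/GCN outputs, fusion is plain addition, and the clustering is a naive k-means; none of these implementation details come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, dim, n_protos = 100, 16, 4

# Stand-ins for the two channel outputs: in the paper these come from an
# HGCN over cascade-level hypergraphs and a GCN over the dynamic
# friendship forwarding graph; here they are random placeholders.
hyper_users = rng.normal(size=(n_users, dim))
graph_users = rng.normal(size=(n_users, dim))

# Fuse the two user representations (simple sum; the paper's fusion is richer).
users = hyper_users + graph_users

# Cascade representations, one per observed cascade (placeholders).
cascades = rng.normal(size=(30, dim))

# Prototypes of the cascade representations via a few naive k-means steps.
protos = cascades[:n_protos].copy()
for _ in range(10):
    assign = np.argmin(((cascades[:, None] - protos[None]) ** 2).sum(-1), axis=1)
    for k in range(n_protos):
        if (assign == k).any():
            protos[k] = cascades[assign == k].mean(0)

# For a target cascade, retrieve the best-matching prototype and fuse it
# into every user representation before scoring candidates.
target = cascades[0]
best = protos[np.argmin(((protos - target) ** 2).sum(-1))]
fused = users + best  # prototype-aware user representations

# Diffusion probability over candidate users: softmax of dot-product scores.
scores = fused @ target
probs = np.exp(scores - scores.max())
probs /= probs.sum()
```

The softmax at the end yields one diffusion probability per candidate user; in the real model, users already in the cascade would be masked out before ranking.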
| Dataset | Users | Friendship relations | | Cascade information | | |
| --- | --- | --- | --- | --- | --- | --- |
| | | Edges | Density | Cascades | Avg. length | Density |
| Twitter | 12 627 | 309 631 | 24.52 | 3 442 | 32.60 | 8.89 |
| Douban | 12 232 | 396 580 | 30.21 | 3 475 | 21.76 | 6.18 |
| Android | 9 958 | 48 573 | 4.87 | 679 | 33.30 | 2.27 |
| Christianity | 2 897 | 35 624 | 12.30 | 589 | 22.90 | 4.66 |

Tab. 1 Dataset statistics
| Model | Twitter | | | Douban | | | Android | | | Christianity | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | Hits@10 | Hits@50 | Hits@100 | Hits@10 | Hits@50 | Hits@100 | Hits@10 | Hits@50 | Hits@100 | Hits@10 | Hits@50 | Hits@100 |
| DeepDiffuse | 5.79 | 10.80 | 18.39 | 9.02 | 14.93 | 19.13 | 4.13 | 10.58 | 17.21 | 10.27 | 21.83 | 30.74 |
| Topo-LSTM | 8.45 | 15.80 | 25.42 | 8.57 | 16.53 | 21.47 | 4.56 | 12.63 | 16.53 | 12.28 | 22.63 | 31.52 |
| NDM | 15.21 | 28.23 | 32.30 | 10.00 | 21.13 | 30.14 | 4.85 | 14.24 | 18.97 | 15.41 | 31.36 | 45.86 |
| SNIDSA | 25.37 | 36.64 | 42.89 | 16.23 | 27.24 | 35.59 | 5.63 | 15.22 | 20.93 | 17.74 | 34.58 | 48.76 |
| FOREST | 28.67 | 42.07 | 49.75 | 19.50 | 32.03 | 39.08 | 9.68 | 17.73 | 24.08 | 24.85 | 42.01 | 51.28 |
| Inf-VAE | 14.85 | 32.72 | 45.72 | 8.94 | 22.02 | 35.72 | 5.98 | 14.70 | 20.91 | 18.38 | 38.50 | 51.05 |
| DyHGCN | 31.88 | 45.05 | 52.19 | 18.71 | 32.33 | 39.71 | 9.10 | 16.38 | 23.09 | 26.62 | 42.80 | 52.47 |
| MS-HGAT | 33.50 | 49.59 | 58.91 | 21.33 | 35.25 | 42.75 | 10.41 | 20.31 | 27.55 | 28.80 | 47.14 | 55.62 |
| PDGCN | 35.30* | 50.98* | 59.60* | 22.80* | 36.23* | 44.06* | 10.71# | 20.48# | 28.62* | 29.19# | 47.58# | 56.01# |

Tab. 2 Hits@k scores (k = 10, 50, 100) of models on four datasets (%)
| Model | Twitter | | | Douban | | | Android | | | Christianity | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | MAP@10 | MAP@50 | MAP@100 | MAP@10 | MAP@50 | MAP@100 | MAP@10 | MAP@50 | MAP@100 | MAP@10 | MAP@50 | MAP@100 |
| DeepDiffuse | 5.87 | 6.80 | 6.39 | 6.02 | 6.93 | 7.13 | 2.30 | 2.53 | 2.56 | 7.27 | 7.83 | 7.84 |
| Topo-LSTM | 8.51 | 12.68 | 13.68 | 6.57 | 7.53 | 7.78 | 3.60 | 4.05 | 4.06 | 7.93 | 8.67 | 9.86 |
| NDM | 12.41 | 13.23 | 14.30 | 8.24 | 8.73 | 9.14 | 2.01 | 2.22 | 2.93 | 7.41 | 7.68 | 7.86 |
| SNIDSA | 15.34 | 16.64 | 16.89 | 10.02 | 11.24 | 11.59 | 2.98 | 3.24 | 3.97 | 8.69 | 8.94 | 9.72 |
| FOREST | 19.60 | 20.21 | 21.75 | 11.26 | 11.84 | 11.94 | 5.83 | 6.17 | 6.26 | 14.64 | 15.45 | 15.58 |
| Inf-VAE | 19.80 | 20.66 | 21.32 | 11.02 | 11.28 | 12.28 | 4.82 | 4.86 | 5.27 | 9.25 | 11.96 | 12.45 |
| DyHGCN | 20.87 | 21.48 | 21.58 | 10.61 | 11.26 | 11.36 | 6.09 | 6.40 | 6.50 | 15.64 | 16.30 | 16.44 |
| MS-HGAT | 22.49 | 23.17 | 23.30 | 11.72 | 12.52 | 12.60 | 6.39 | 6.87 | 6.96 | 17.44 | 18.27 | 18.40 |
| PDGCN | 23.64* | 24.35* | 24.47* | 12.13# | 12.76# | 12.87# | 6.47# | 6.89# | 7.01# | 17.75# | 18.54# | 18.68# |

Tab. 3 MAP@k scores (k = 10, 50, 100) of models on four datasets (%)
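Tables 2 and 3 report Hits@k and MAP@k for next-user prediction. With a single ground-truth user per prediction step, Hits@k checks whether that user appears among the top-k ranked candidates, and AP@k reduces to the reciprocal rank when the rank falls within k (0 otherwise). A sketch of both metrics — the function name and the toy scores below are illustrative, not from the paper:

```python
import numpy as np

def hits_and_map_at_k(scores, targets, k):
    """Hits@k and MAP@k for next-user prediction with one ground-truth
    user per step; with a single target, AP@k is 1/rank if rank <= k."""
    hits, ap = [], []
    for s, t in zip(scores, targets):
        # 1-based rank of the target user under descending scores
        rank = int((s > s[t]).sum()) + 1
        hits.append(1.0 if rank <= k else 0.0)
        ap.append(1.0 / rank if rank <= k else 0.0)
    return float(np.mean(hits)), float(np.mean(ap))

# Toy check: 3 candidate users, 2 prediction steps.
scores = np.array([[0.1, 0.7, 0.2],
                   [0.5, 0.3, 0.2]])
targets = [1, 2]  # true next users
h, m = hits_and_map_at_k(scores, targets, k=2)
# step 1: target ranked 1st (hit); step 2: target ranked 3rd (miss at k=2)
```

Both metrics are averaged over all prediction steps in the test cascades and reported as percentages in the tables.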
| Model | Twitter | | Android | |
| --- | --- | --- | --- | --- |
| | Hits@100 | MAP@100 | Hits@100 | MAP@100 |
| w/o friendship forwarding graph | 57.92 | 22.87 | 27.77 | 6.96 |
| w/o hypergraph user representation | 58.82 | 23.66 | 27.83 | 6.88 |
| w/o prototype fusion | 59.15 | 24.01 | 28.01 | 6.81 |
| PDGCN | 59.60 | 24.47 | 28.62 | 7.01 |

Tab. 4 Results of ablation experiments (%)
| Model | Twitter | | Android | |
| --- | --- | --- | --- | --- |
| | Total parameters | Inference time/s | Total parameters | Inference time/s |
| MS-HGAT | 42 901 456 | 8.056 7 | 39 951 756 | 0.377 8 |
| PDGCN | 167 277 559 | 8.419 4 | 164 086 259 | 0.415 1 |

Tab. 5 Computational efficiency of different models
[1] WU L, RAO Y, ZHAO Y, et al. DTCA: decision tree-based co-attention networks for explainable claim verification[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2020: 1024-1035.
[2] YANG C, WU Q, GAO X, et al. EPOC: a survival perspective early pattern detection model for outbreak cascades[C]// Proceedings of the 2018 International Conference on Database and Expert Systems Applications, LNCS 11029. Cham: Springer, 2018: 336-351.
[3] WU Y B, GAO H, ZENG W S, et al. Integrating user relation representations and information diffusion topology features for information propagation prediction[J/OL]. Journal of Frontiers of Computer Science and Technology, 1-13 (2023-11-07) [2023-12-13].
[4] GAO S, MA J, CHEN Z. Effective and effortless features for popularity prediction in microblogging network[C]// Companion Proceedings of the 23rd International Conference on World Wide Web. New York: ACM, 2014: 269-270.
[5] CHENG J, ADAMIC L, DOW P A, et al. Can cascades be predicted?[C]// Proceedings of the 23rd International Conference on World Wide Web. New York: ACM, 2014: 925-936.
[6] ZHAO Q, ERDOGDU M A, HE H Y, et al. SEISMIC: a self-exciting point process model for predicting tweet popularity[C]// Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2015: 1513-1522.
[7] BAO P. Modeling and predicting popularity dynamics via an influence-based self-excited Hawkes process[C]// Proceedings of the 25th ACM International Conference on Information and Knowledge Management. New York: ACM, 2016: 1897-1900.
[8] ISLAM M R, MUTHIAH S, ADHIKARI B, et al. DeepDiffuse: predicting the "who" and "when" in cascades[C]// Proceedings of the 2018 IEEE International Conference on Data Mining. Piscataway: IEEE, 2018: 1055-1060.
[9] ZHOU J, CUI G, HU S, et al. Graph neural networks: a review of methods and applications[J]. AI Open, 2020, 1: 57-81.
[10] WANG J, ZHENG V W, LIU Z, et al. Topological recurrent neural network for diffusion prediction[C]// Proceedings of the 2017 IEEE International Conference on Data Mining. Piscataway: IEEE, 2017: 475-484.
[11] YANG C, SUN M, LIU H, et al. Neural diffusion model for microscopic cascade study[J]. IEEE Transactions on Knowledge and Data Engineering, 2021, 33(3): 1128-1139.
[12] WANG Z, CHEN C, LI W. A sequential neural information diffusion model with structure attention[C]// Proceedings of the 27th ACM International Conference on Information and Knowledge Management. New York: ACM, 2018: 1795-1798.
[13] YANG C, TANG J, SUN M, et al. Multi-scale information diffusion prediction with reinforced recurrent networks[C]// Proceedings of the 28th International Joint Conference on Artificial Intelligence. California: ijcai.org, 2019: 4033-4039.
[14] SANKAR A, ZHANG X, KRISHNAN A, et al. Inf-VAE: a variational autoencoder framework to integrate homophily and influence in diffusion prediction[C]// Proceedings of the 13th International Conference on Web Search and Data Mining. New York: ACM, 2020: 510-518.
[15] YUAN C, LI J, ZHOU W, et al. DyHGCN: a dynamic heterogeneous graph convolutional network to learn users' dynamic preferences for information diffusion prediction[C]// Proceedings of the 2020 European Conference on Machine Learning and Knowledge Discovery in Databases, LNCS 12459. Cham: Springer, 2021: 347-363.
[16] SUN L, RAO Y, ZHANG X, et al. MS-HGAT: memory-enhanced sequential hypergraph attention network for information diffusion prediction[C]// Proceedings of the 36th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2022: 4156-4164.
[17] YADATI N, NIMISHAKAVI M, YADAV P, et al. HyperGCN: a new method for training graph convolutional networks on hypergraphs[C]// Proceedings of the 33rd International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2019: 1511-1522.
[18] KIPF T N, WELLING M. Semi-supervised classification with graph convolutional networks[EB/OL]. (2017-02-22) [2023-05-13].
[19] VELIČKOVIĆ P, CUCURULL G, CASANOVA A, et al. Graph attention networks[EB/OL]. (2018-02-04) [2023-05-13].
[20] HAMILTON W L, YING R, LESKOVEC J. Inductive representation learning on large graphs[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2017: 1025-1035.
[21] ZHU Y, XU Y, YU F, et al. Graph contrastive learning with adaptive augmentation[C]// Proceedings of the Web Conference 2021. New York: ACM, 2021: 2069-2080.
[22] LI J, ZHOU P, XIONG C, et al. Prototypical contrastive learning of unsupervised representations[EB/OL]. (2021-03-30) [2023-05-13].
[23] TAN Y, LONG G, LIU L, et al. FedProto: federated prototype learning across heterogeneous clients[C]// Proceedings of the 36th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2022: 8432-8440.
[24] DU Y, LYU L F, JIAO Y C. Fuzzy prototype network based on fuzzy reasoning[J]. Journal of Computer Applications, 2021, 41(7): 1885-1890.
[25] HODAS N O, LERMAN K. The simple rules of social contagion[J]. Scientific Reports, 2014, 4: No.4343.
[26] ZHONG E, FAN W, WANG J, et al. ComSoc: adaptive transfer of user behaviors over composite social network[C]// Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2012: 696-704.