Developer recommendation for open-source projects based on collaborative contribution network

doi:10.11772/j.issn.1001-9081.2024040454

Journal of Computer Applications, 2025, Vol. 45, Issue (4): 1213-1222. DOI: 10.11772/j.issn.1001-9081.2024040454

Lan YOU¹,², Yuang ZHANG¹, Yuan LIU¹,³, Zhijun CHEN¹,² (corresponding author), Wei WANG⁴, Xing ZENG¹,³, Zhangwei HE¹

1. College of Computer Science, Hubei University, Wuhan, Hubei 430062, China
2. Hubei Key Laboratory of Big Data Intelligent Analysis and Application (Hubei University), Wuhan, Hubei 430062, China
3. Key Laboratory of Intelligent Sensing System and Security, Ministry of Education (Hubei University), Wuhan, Hubei 430062, China
4. School of Data Science and Engineering, East China Normal University, Shanghai 200062, China

Received: 2024-04-16 Revised: 2024-11-06 Accepted: 2024-11-07 Online: 2025-04-08 Published: 2025-04-10
Corresponding author: Zhijun CHEN
About the authors:
YOU Lan, born in 1978, female, from Wuhan, Hubei, Ph.D., professor, CCF member. Her research interests include open-source digital ecology, spatio-temporal big data, and intelligent digital twins.
ZHANG Yuang, born in 1997, male, from Wuhan, Hubei, M.S. candidate. His research interests include graph neural networks and recommender systems.
LIU Yuan, born in 2001, male, from Suizhou, Hubei, M.S. candidate. His research interests include open-source strategy and software engineering.
WANG Wei, born in 1979, male, from Shanghai, Ph.D., professor. His research interests include open-source strategy, open-source measurement, and open-source digital ecosystems.
ZENG Xing, born in 1987, male, from Wuhan, Hubei, Ph.D., lecturer. His research interests include spatio-temporal big data analysis and mining, machine learning, and artificial intelligence.
HE Zhangwei, born in 1998, male, from Wuhan, Hubei, M.S. candidate. His research interests include open-source strategy and software engineering.
Supported by:
Key Research and Development Program of Hubei Province (2022BAA044).

Abstract:

Recommending developers for open-source projects is important for building the open-source ecosystem. Unlike traditional software development, the developers, projects, and organizations of the open-source field, together with the relationships among them, reflect the characteristics of open collaborative projects, and the semantics they carry help to recommend developers for open-source projects accurately. A Developer Recommendation method based on the Collaborative Contribution Network (DRCCN) is therefore proposed. First, a Collaborative Contribution Network (CCN) is constructed from the contribution relationships among Open-Source Software (OSS) developers, OSS projects, and OSS organizations. Second, a three-layer heterogeneous GraphSAGE (Graph SAmple and aggreGatE) Graph Neural Network (GNN) model is built on the CCN to predict links between developer nodes and open-source project nodes and to produce the corresponding embedding pairs. Finally, based on the prediction results, the K-Nearest Neighbor (KNN) algorithm is used to complete the developer recommendation. In experiments that train and test the model on a GitHub dataset, DRCCN improves precision, recall, and F1 score by approximately 10.7%, 2.6%, and 4.2%, respectively, compared with CL4SRec (Contrastive Learning for Sequential Recommendation), a contrastive learning model for sequential recommendation; these are relative gains over the CL4SRec values in Table 4. The proposed model can therefore provide an important reference for recommending developers to open-source community projects.

Key words: open-source ecology, developer recommendation, heterogeneous information network, Graph Neural Network (GNN), Open-Source Software (OSS)
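
The final two steps described in the abstract (scoring developer-project embedding pairs, then picking the nearest neighbors) can be illustrated with a minimal sketch. It assumes the embeddings have already been produced by the GNN and uses dot-product similarity, as suggested by the caption of Fig. 7. The embedding values and the helper names `dot` and `recommend_developers` are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple


def dot(u: List[float], v: List[float]) -> float:
    """Dot-product similarity between two embeddings of equal dimension."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    return sum(a * b for a, b in zip(u, v))


def recommend_developers(
    project_emb: List[float],
    developer_embs: Dict[str, List[float]],
    k: int,
) -> List[Tuple[str, float]]:
    """Return the k developers whose embeddings score highest against the project."""
    scored = [(name, dot(project_emb, emb)) for name, emb in developer_embs.items()]
    # Sort by descending score; break ties by name so the output is deterministic.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:k]


# Toy embeddings (made-up values for illustration only).
developers = {
    "dev_a": [0.9, 0.1, 0.0],
    "dev_b": [0.2, 0.8, 0.1],
    "dev_c": [0.7, 0.3, 0.2],
}
project = [1.0, 0.0, 0.5]

top2 = recommend_developers(project, developers, k=2)
print([name for name, _ in top2])
```

With these toy vectors the two highest-scoring developers are `dev_a` and `dev_c`. A real system would replace the exhaustive scan with an approximate nearest-neighbor index once the developer set grows large.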

CLC number: TP311.5

Cite this article: Lan YOU, Yuang ZHANG, Yuan LIU, Zhijun CHEN, Wei WANG, Xing ZENG, Zhangwei HE. Developer recommendation for open-source projects based on collaborative contribution network[J]. Journal of Computer Applications, 2025, 45(4): 1213-1222.

Figures/Tables (22)

Fig. 1 Overall framework

Fig. 2 Network schema of the collaborative contribution network

Fig. 3 Metapaths of the collaborative contribution network

Fig. 4 Number of nodes and average degree at each GraphSAGE layer

Fig. 5 Three-layer graph neural network

Fig. 6 Link prediction flow

Fig. 7 Dot-product similarity calculation

Fig. 8 Nearest-neighbor recommendation

Fig. 9 Data preprocessing steps

Fig. 10 Influence of dataset size on precision (K=20)

Fig. 11 Influence of dataset size on recall (K=20)

Fig. 12 Influence of dataset size on F1 score (K=20)

Fig. 13 Distribution of project description text lengths in the dataset

Fig. 14 Influence of project description text length on precision (K=20)

Fig. 15 Influence of project description text length on recall (K=20)

Fig. 16 Influence of project description text length on F1 score (K=20)

Fig. 17 Influence of the number of model layers on performance

Tab. 1 Detailed parameter settings for each combination

Combination No.	GNN dimension	Learning rate	Weight decay	Batch size
1	128	0.001	1E-4	512
2	128	0.001	1E-4	1024
3	128	0.001	5E-4	512
4	128	0.001	5E-4	1024
5	128	0.010	1E-4	512
6	128	0.010	1E-4	1024
7	256	0.001	1E-4	512
8	256	0.001	1E-4	1024
9	256	0.001	5E-4	512
10	256	0.001	5E-4	1024
11	256	0.010	1E-4	512
12	256	0.010	1E-4	1024
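
The 12 rows of Tab. 1 cover all but four points of a 2×2×2×2 grid: no row pairs a learning rate of 0.010 with a weight decay of 5E-4. The sketch below regenerates the rows in table order under that observation. The function name `parameter_combinations` and the dictionary keys are hypothetical, and the reason those combinations are absent is not stated in this section.

```python
from itertools import product

gnn_dims = [128, 256]
learning_rates = [0.001, 0.010]
weight_decays = [1e-4, 5e-4]
batch_sizes = [512, 1024]


def parameter_combinations():
    """Enumerate the settings listed in Tab. 1, in the same row order."""
    combos = []
    for dim, lr, wd, bs in product(gnn_dims, learning_rates, weight_decays, batch_sizes):
        # Tab. 1 has no row with learning rate 0.010 and weight decay 5E-4.
        if lr == 0.010 and wd == 5e-4:
            continue
        combos.append(
            {"gnn_dim": dim, "learning_rate": lr, "weight_decay": wd, "batch_size": bs}
        )
    return combos


combos = parameter_combinations()
for number, combo in enumerate(combos, start=1):
    print(number, combo)
```

Iterating with the dimension as the outermost factor and the batch size as the innermost reproduces the numbering used in the table.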

Fig. 18 F1 score and duration under different parameter combinations

Tab. 2 Experimental parameter settings

Parameter	Value
GNN nonlinear transformation dimension per layer	256
Learning rate	0.001
Weight decay	5E-4
Epochs	50
Batch size	1024
Shuffle	True
Drop last	False
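
The Batch size, Shuffle, and Drop last entries of Tab. 2 read like data-loader options, although the training framework is not named in this section. As a framework-independent illustration of what these three settings mean, the sketch below splits sample indices into mini-batches in plain Python. The function `make_batches`, the fixed seed, and the sample count of 2500 are assumptions made for the example.

```python
import random
from typing import List


def make_batches(
    n_samples: int,
    batch_size: int = 1024,
    shuffle: bool = True,
    drop_last: bool = False,
    seed: int = 0,
) -> List[List[int]]:
    """Split sample indices into mini-batches using the Tab. 2 style options."""
    indices = list(range(n_samples))
    if shuffle:
        # A seeded generator keeps the illustration reproducible.
        random.Random(seed).shuffle(indices)
    batches = [indices[i:i + batch_size] for i in range(0, n_samples, batch_size)]
    # With drop_last=True an incomplete final batch would be discarded.
    if drop_last and batches and len(batches[-1]) < batch_size:
        batches.pop()
    return batches


batches = make_batches(2500)  # hypothetical number of training samples
print(len(batches), len(batches[-1]))
```

With Drop last set to False, as in Tab. 2, the final partial batch is kept, so every sample is visited once per epoch.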

Tab. 3 Comparison results of DRCCN ablation experiments (%)

Method	Precision	Recall	F1 score
w/o CCN	49.78±3.24	53.45±2.36	51.66±3.12
w/o GraphSAGE	54.33±0.75	61.75±1.62	60.52±1.20
w/o Link-Prediction	59.61±1.52	65.20±2.03	63.28±1.95
DRCCN	62.29±1.87	71.30±1.35	66.41±1.72
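
The three metrics in Tab. 3 can be computed for a single top-K recommendation list as sketched below. These are the standard definitions of precision, recall, and F1 at K; the paper's exact formulation is not given in this section, so the function `precision_recall_f1_at_k` and the toy developer identifiers are assumptions.

```python
from typing import List, Set, Tuple


def precision_recall_f1_at_k(
    recommended: List[str], relevant: Set[str], k: int
) -> Tuple[float, float, float]:
    """Standard precision@K, recall@K, and F1@K for one recommendation list."""
    top_k = recommended[:k]
    hits = sum(1 for dev in top_k if dev in relevant)
    precision = hits / k if k > 0 else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example: 4 recommended developers, 3 actual contributors.
recommended = ["d1", "d2", "d3", "d4"]
relevant = {"d1", "d3", "d5"}
p, r, f1 = precision_recall_f1_at_k(recommended, relevant, k=4)
print(p, r, f1)
```

The ± values in the table suggest averages over repeated runs, in which case the reported F1 need not equal the harmonic mean of the reported precision and recall in the same row.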

Tab. 4 Comparison of experimental results for different developer recommendation methods (%)

Method	Precision	Recall	F1 score
DRCCN	62.29±1.87	71.30±1.35	66.41±1.72
CL4SRec	56.27±1.72	69.51±1.81	63.74±1.10
LightGCN	55.40±0.46	69.25±0.92	61.51±1.30
ConRec	54.88±1.31	65.12±1.40	59.27±1.35
GFCF	53.16±0.49	64.28±0.70	57.85±0.93
DRDPBG	51.87±2.10	60.72±2.57	55.93±1.92
S³Rec	47.85±3.02	51.18±2.47	48.92±3.14
DCF	29.45±1.20	34.96±1.17	31.90±1.32
SCF	27.13±2.62	33.81±1.95	30.17±2.32
DRCB	38.37±1.89	45.92±1.23	41.85±0.92
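
The improvements quoted in the abstract (about 10.7%, 2.6%, and 4.2%) match the relative gains of DRCCN over CL4SRec computed from the mean values in Tab. 4. The short check below reproduces that arithmetic. It ignores the ± spread, and the helper name `relative_gain` is hypothetical.

```python
# Mean values copied from the DRCCN and CL4SRec rows of Tab. 4 (in %).
drccn = {"precision": 62.29, "recall": 71.30, "f1": 66.41}
cl4srec = {"precision": 56.27, "recall": 69.51, "f1": 63.74}


def relative_gain(new: float, base: float) -> float:
    """Relative improvement of `new` over `base`, in percent."""
    return (new - base) / base * 100.0


gains = {metric: round(relative_gain(drccn[metric], cl4srec[metric]), 1) for metric in drccn}
print(gains)  # {'precision': 10.7, 'recall': 2.6, 'f1': 4.2}
```

In absolute terms the same differences are 6.02, 1.79, and 2.67 percentage points.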

