Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (4): 1213-1222. DOI: 10.11772/j.issn.1001-9081.2024040454

• Data Science and Technology •

Developer recommendation for open-source projects based on collaborative contribution network

Lan YOU1,2, Yuang ZHANG1, Yuan LIU1,3, Zhijun CHEN1,2, Wei WANG4, Xing ZENG1,3, Zhangwei HE1

  1. College of Computer Science, Hubei University, Wuhan, Hubei 430062, China
  2. Hubei Key Laboratory of Big Data Intelligent Analysis and Application (Hubei University), Wuhan, Hubei 430062, China
  3. Key Laboratory of Intelligent Sensing System and Security, Ministry of Education (Hubei University), Wuhan, Hubei 430062, China
  4. School of Data Science and Engineering, East China Normal University, Shanghai 200062, China
  • Received: 2024-04-16 Revised: 2024-11-06 Accepted: 2024-11-07 Online: 2025-04-08 Published: 2025-04-10
  • Contact: Zhijun CHEN
  • About the authors:
    YOU Lan, born in 1978, a native of Wuhan, Hubei, Ph.D., professor, CCF member. Her research interests include open-source digital ecology, spatio-temporal big data, and intelligent digital twins.
    ZHANG Yuang, born in 1997, a native of Wuhan, Hubei, M.S. candidate. His research interests include graph neural networks and recommender systems.
    LIU Yuan, born in 2001, a native of Suizhou, Hubei, M.S. candidate. His research interests include open-source strategy and software engineering.
    WANG Wei, born in 1979, a native of Shanghai, Ph.D., professor. His research interests include open-source strategy, open-source surveying, and open-source digital ecosystems.
    ZENG Xing, born in 1987, a native of Wuhan, Hubei, Ph.D., lecturer. His research interests include spatio-temporal big data analysis and mining, machine learning, and artificial intelligence.
    HE Zhangwei, born in 1998, a native of Wuhan, Hubei, M.S. candidate. His research interests include open-source strategy and software engineering.
  • Supported by:
    Key Research and Development Program of Hubei Province (2022BAA044)

Abstract:

Recommending developers for open-source projects is important for building the open-source ecology. Unlike traditional software development, the developers, projects, and organizations in the open-source field, together with the relationships among them, reflect the characteristics of open collaborative projects, and the semantics they carry help recommend developers for open-source projects accurately. Therefore, a Developer Recommendation method based on Collaborative Contribution Network (DRCCN) is proposed. First, a Collaborative Contribution Network (CCN) is constructed from the contribution relationships among Open-Source Software (OSS) developers, OSS projects, and OSS organizations. Second, based on the CCN, a three-layer heterogeneous GraphSAGE (Graph SAmple and aggreGatE) Graph Neural Network (GNN) model is built to predict links between developer nodes and open-source project nodes, producing the corresponding embedding pairs. Finally, based on the prediction results, the K-Nearest Neighbor (KNN) algorithm is used to complete the developer recommendation. The model was trained and tested on a GitHub dataset, and the experimental results show that, compared with CL4SRec (Contrastive Learning for Sequential Recommendation), a contrastive learning model for sequential recommendation, DRCCN improves precision, recall, and F1 score by approximately 10.7%, 2.6%, and 4.2%, respectively. The proposed model can therefore serve as a useful reference for developer recommendation in open-source community projects.

Key words: open-source ecology, developer recommendation, heterogeneous information network, Graph Neural Network (GNN), Open-Source Software (OSS)
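
The pipeline summarized in the abstract (build a contribution network, aggregate neighbourhood information over three layers, then pick nearest neighbours) can be illustrated with a toy sketch. This is not the authors' DRCCN implementation: it uses no learned weights and no link-prediction training, only untrained GraphSAGE-style mean aggregation over one-hot features followed by a cosine-similarity nearest-neighbour lookup. The node names, relation names (`contributes_to`, `member_of`, `owns`), the 0.5/0.5 self/neighbour mixing, and all function names are hypothetical choices made for illustration.

```python
import math
from collections import defaultdict

# Hypothetical toy contribution records: (source node, relation, target node).
EDGES = [
    ("dev:alice", "contributes_to", "proj:graphlib"),
    ("dev:alice", "contributes_to", "proj:tensorkit"),
    ("dev:bob", "contributes_to", "proj:graphlib"),
    ("dev:carol", "contributes_to", "proj:webapp"),
    ("dev:bob", "member_of", "org:ml-lab"),
    ("org:ml-lab", "owns", "proj:tensorkit"),
    ("org:web-co", "owns", "proj:webapp"),
    ("dev:carol", "member_of", "org:web-co"),
]


def build_ccn(edges):
    """Build a heterogeneous graph as node -> relation -> set of neighbours."""
    graph = defaultdict(lambda: defaultdict(set))
    for src, rel, dst in edges:
        graph[src][rel].add(dst)
        graph[dst]["rev_" + rel].add(src)  # reverse edge so messages flow both ways
    return graph


def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def l2_normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def sage_layer(graph, feats):
    """One untrained mean-aggregation layer (assumption: no weight matrices)."""
    new_feats = {}
    for node, relations in graph.items():
        # Mean over neighbours within each relation type, then across relation types.
        per_relation = [mean([feats[nb] for nb in nbs]) for nbs in relations.values()]
        neighbourhood = mean(per_relation)
        combined = [0.5 * s + 0.5 * n for s, n in zip(feats[node], neighbourhood)]
        new_feats[node] = l2_normalize(combined)
    return new_feats


def embed(graph, num_layers=3):
    """Start from one-hot node features and apply num_layers aggregation layers."""
    nodes = sorted(graph)
    feats = {n: [1.0 if i == j else 0.0 for j in range(len(nodes))]
             for i, n in enumerate(nodes)}
    for _ in range(num_layers):
        feats = sage_layer(graph, feats)
    return feats


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def recommend_developers(graph, embeddings, project, k=2):
    """Return the k developers nearest to the project that do not yet contribute to it."""
    current = graph[project].get("rev_contributes_to", set())
    candidates = [n for n in graph if n.startswith("dev:") and n not in current]
    ranked = sorted(candidates,
                    key=lambda d: cosine(embeddings[d], embeddings[project]),
                    reverse=True)
    return ranked[:k]


graph = build_ccn(EDGES)
embeddings = embed(graph, num_layers=3)
print(recommend_developers(graph, embeddings, "proj:tensorkit", k=1))
```

In this toy graph, `dev:bob` is ranked first for `proj:tensorkit` because he shares an organization and a co-contributed project with its neighbourhood, while `dev:carol` sits in a disconnected component. A real system along the lines the abstract describes would replace the fixed averaging with trained heterogeneous GraphSAGE layers and a link-prediction objective.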

CLC number: