Journal of Computer Applications ›› 2026, Vol. 46 ›› Issue (2): 518-527. DOI: 10.11772/j.issn.1001-9081.2025020217
• Computer software technology •
Qiao YU1, Zirui HUANG1, Shengyi CHENG1, Yi ZHU1, Shutao ZHANG2
Received: 2025-03-06
Revised: 2025-05-11
Accepted: 2025-05-13
Online: 2025-05-16
Published: 2026-02-10
Contact: Zirui HUANG
About author: YU Qiao, born in 1989 in Laiyang, Shandong, Ph. D., associate professor, CCF member. Her research interests include machine learning, software defect prediction, and vulnerability detection.
Abstract:
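As software is deployed across more and more domains, the number of software vulnerabilities keeps growing, and deep learning-based vulnerability detection methods have been widely adopted. However, existing graph representation learning methods usually ignore the influence of the graph's edges on vulnerability detection, and they represent edge weights too coarsely. To address this problem, a software vulnerability detection method based on edge weight, EWVD (Edge Weight for Vulnerability Detection), is proposed. First, comments, user-defined variable names, and function names in the source code are cleaned and replaced with abstract representations. Second, Sent2Vec is selected for the embedding representation after a comparative analysis. Third, three measures (connection structure, importance of neighbor nodes, and Jaccard similarity) are combined to compute edge weights, which capture the information transfer capability between nodes. Finally, the edge weights are used to improve the model's perception of latent relationships among vulnerable statements, and thus to determine the importance of edges in the graph. Experimental results show that, compared with VulCNN, the best of seven vulnerability detection baselines, EWVD improves accuracy by 1.06 percentage points and reduces the False Positive Rate (FPR) by 1.11 percentage points. EWVD therefore refines the representation of edge weights and improves the overall performance of vulnerability detection.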
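The abstract names the three measures but does not give their formulas. The following is a minimal sketch of how such a combined edge weight could look, assuming (hypothetically) that connection structure is approximated by the share of nodes that close a triangle on the edge, that neighbor importance is approximated by the mean normalized degree of the two endpoints, and that the three scores are simply averaged. None of these choices is confirmed by the paper; `Graph`, `edge_weight`, and `demo` are illustrative names.

```python
from typing import Dict, Set

Graph = Dict[str, Set[str]]  # undirected graph stored as adjacency sets


def jaccard(g: Graph, u: str, v: str) -> float:
    """Jaccard similarity of the neighbor sets of u and v."""
    union = g[u] | g[v]
    return len(g[u] & g[v]) / len(union) if union else 0.0


def edge_weight(g: Graph, u: str, v: str) -> float:
    """Hypothetical combined weight in [0, 1] for the edge (u, v)."""
    n = len(g)
    # Assumption: connection structure ~ fraction of other nodes that are
    # common neighbors of u and v (each one closes a triangle on the edge).
    structure = len(g[u] & g[v]) / max(n - 2, 1)
    # Assumption: neighbor importance ~ mean normalized degree of endpoints.
    importance = (len(g[u]) + len(g[v])) / (2 * max(n - 1, 1))
    # Assumption: the three measures are combined by an unweighted average.
    return (structure + importance + jaccard(g, u, v)) / 3


# Toy graph: a triangle a-b-c with a pendant node d attached to c.
demo: Graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
w_ab = edge_weight(demo, "a", "b")  # edge inside the triangle
w_cd = edge_weight(demo, "c", "d")  # edge to the pendant node
```

In this toy graph the edge inside the triangle receives a larger weight than the edge to the pendant node, which is the kind of distinction a finer-grained edge weight is meant to expose.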
Qiao YU, Zirui HUANG, Shengyi CHENG, Yi ZHU, Shutao ZHANG. Software vulnerability detection method based on edge weight[J]. Journal of Computer Applications, 2026, 46(2): 518-527.
| Dataset | Vulnerable functions | Non-vulnerable functions |
|---|---|---|
| NVD | 1 384 | 5 913 |
| SARD | 12 303 | 21 057 |
Tab. 1 Number of samples in the datasets
| Method | FPR | FNR | Accuracy |
|---|---|---|---|
| RATS | 27.91 | 64.43 | 59.17 |
| FlawFinder | 54.67 | 47.62 | 59.08 |
| VulDeePecker | 25.37 | 14.95 | 76.68 |
| VGDetector | 26.14 | 16.62 | 78.53 |
| VulDeeLocator | 21.71 | 14.36 | 80.72 |
| Devign | 24.76 | 13.37 | 83.06 |
| VulCNN | 18.27 | 17.47 | 83.11 |
| EWVD | 17.16 | 16.43 | 84.17 |
Tab. 2 Vulnerability detection results of seven baseline methods and EWVD (%)
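FPR, FNR, and accuracy are standard confusion-matrix metrics. As a reminder of how they relate, the sketch below computes them from raw counts and then reproduces the gaps between EWVD and VulCNN quoted in the abstract. The confusion-matrix counts are made up for illustration and are not from the paper; only the percentages in the second part come from Tab. 2.

```python
def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Return FPR, FNR and accuracy as percentages."""
    return {
        "FPR": 100 * fp / (fp + tn),       # non-vulnerable flagged as vulnerable
        "FNR": 100 * fn / (fn + tp),       # vulnerable functions that were missed
        "Accuracy": 100 * (tp + tn) / (tp + fp + tn + fn),
    }


# Illustrative counts only (hypothetical, not the paper's data).
m = detection_metrics(tp=80, fp=10, tn=90, fn=20)

# Gaps between EWVD and VulCNN, using the values reported in Tab. 2.
accuracy_gain = round(84.17 - 83.11, 2)   # percentage points
fpr_reduction = round(18.27 - 17.16, 2)   # percentage points
```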
| Embedding method | FPR | FNR | Accuracy |
|---|---|---|---|
| struc2vec | 29.26 | 29.10 | 72.69 |
| node2vec | 27.15 | 26.28 | 74.52 |
| Sent2Vec | 17.16 | 16.43 | 84.17 |
Tab. 3 Comparison of vulnerability detection performance of embedding methods (%)
| Method | FPR | FNR | Accuracy |
|---|---|---|---|
| RATS | -93.39 | -228.26 | 164.25 |
| FlawFinder | -195.85 | -253.15 | 141.46 |
| VulDeePecker | -44.46 | 8.03 | 41.81 |
| VGDetector | -61.94 | 0.58 | 44.40 |
| VulDeeLocator | -19.08 | 15.85 | 17.28 |
| Devign | -44.79 | 20.16 | 5.58 |
| VulCNN | -7.40 | -4.47 | 3.17 |
Tab. 4 Significance analysis of t-statistic
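This page does not say which t-test produced the values in Tab. 4. Assuming a paired t-test over repeated runs, with each difference taken as EWVD minus the baseline (a convention consistent with the signs of the significant entries in Tab. 4, where a negative FPR value means EWVD is lower), the statistic could be computed as sketched below. The per-run values are hypothetical and `paired_t` is an illustrative helper, not the authors' code.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Sequence


def paired_t(ewvd: Sequence[float], baseline: Sequence[float]) -> float:
    """Paired t-statistic of (EWVD - baseline) over matched runs."""
    diffs = [a - b for a, b in zip(ewvd, baseline)]
    # Standard error of the mean difference (sample standard deviation).
    standard_error = stdev(diffs) / sqrt(len(diffs))
    return mean(diffs) / standard_error


# Hypothetical per-run FPR values (%), NOT taken from the paper.
ewvd_fpr = [17.0, 17.2, 17.1, 17.3]
vulcnn_fpr = [18.1, 18.4, 18.2, 18.3]
t_fpr = paired_t(ewvd_fpr, vulcnn_fpr)  # negative: EWVD's FPR is lower
```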
| Method | FPR | FNR | Accuracy |
|---|---|---|---|
| RATS | 9.38×10⁻¹⁵ | 3.03×10⁻¹⁸ | 5.85×10⁻¹⁷ |
| FlawFinder | 1.20×10⁻¹⁷ | 1.19×10⁻¹⁸ | 2.24×10⁻¹⁶ |
| VulDeePecker | 7.37×10⁻¹² | 2.15×10⁻⁵ | 1.28×10⁻¹¹ |
| VGDetector | 3.76×10⁻¹³ | 0.576 | 7.45×10⁻¹² |
| VulDeeLocator | 1.37×10⁻⁸ | 6.98×10⁻⁸ | 3.29×10⁻⁸ |
| Devign | 6.89×10⁻¹² | 8.46×10⁻⁹ | 0.000342 |
| VulCNN | 0.000413 | 0.00155 | 0.0114 |
Tab. 5 Significance analysis of p-value
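One way to read Tab. 5 is to compare each p-value with a significance level. The sketch below assumes the conventional threshold of 0.05, which this page does not state, and lists the comparisons that fail to reach it. The p-values are copied from Tab. 5; `ALPHA` and `not_significant` are illustrative names.

```python
ALPHA = 0.05  # assumed significance level, not stated on this page

# p-values from Tab. 5, keyed by baseline method and metric.
p_values = {
    "RATS": {"FPR": 9.38e-15, "FNR": 3.03e-18, "Accuracy": 5.85e-17},
    "FlawFinder": {"FPR": 1.20e-17, "FNR": 1.19e-18, "Accuracy": 2.24e-16},
    "VulDeePecker": {"FPR": 7.37e-12, "FNR": 2.15e-5, "Accuracy": 1.28e-11},
    "VGDetector": {"FPR": 3.76e-13, "FNR": 0.576, "Accuracy": 7.45e-12},
    "VulDeeLocator": {"FPR": 1.37e-8, "FNR": 6.98e-8, "Accuracy": 3.29e-8},
    "Devign": {"FPR": 6.89e-12, "FNR": 8.46e-9, "Accuracy": 0.000342},
    "VulCNN": {"FPR": 0.000413, "FNR": 0.00155, "Accuracy": 0.0114},
}

# Collect every (method, metric) pair whose difference is not significant.
not_significant = [
    (method, metric)
    for method, row in p_values.items()
    for metric, p in row.items()
    if p >= ALPHA
]
```

Under that assumed threshold, the only comparison that is not significant is the FNR difference between EWVD and VGDetector (p = 0.576).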
| Ablation variant | FPR | FNR | Accuracy |
|---|---|---|---|
| EWVDwu_EW | 19.47 | 17.46 | 82.67 |
| EWVDTP | 17.49 | 16.51 | 83.59 |
| EWVDNIP | 17.71 | 16.64 | 83.59 |
| EWVDJac | 17.26 | 16.63 | 83.61 |
| EWVDTP_NIP | 17.34 | 16.80 | 83.85 |
| EWVDTP_Jac | 17.49 | 16.49 | 83.76 |
| EWVDNIP_Jac | 17.67 | 16.76 | 83.60 |
| EWVD | 17.16 | 16.43 | 84.17 |
Tab. 6 Ablation experiment results (%)
| Method | Running time | Method | Running time |
|---|---|---|---|
| VulDeePecker | 7.8 | Devign | 12.6 |
| VGDetector | 6.4 | VulCNN | 1.9 |
| VulDeeLocator | 30.7 | EWVD | 2.2 |
Tab. 7 Average running time overhead of deep learning-based baseline methods and EWVD (s)