Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (10): 3170-3176. DOI: 10.11772/j.issn.1001-9081.2021081548
• Computer software technology •
Jingyun CHENG, Buhong WANG, Peng LUO
Received: 2021-08-31
Revised: 2021-11-20
Accepted: 2021-11-21
Online: 2022-01-07
Published: 2022-10-10
Contact: Jingyun CHENG (1508458583@qq.com)
About author: CHENG Jingyun, born in 1998, from Chongqing, M. S. candidate. His research interests include information security.
Jingyun CHENG, Buhong WANG, Peng LUO. Static code defect detection method based on deep semantic fusion[J]. Journal of Computer Applications, 2022, 42(10): 3170-3176.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2021081548
Variable | Backward | Forward |
---|---|---|
argc@main | {14} | {6,8,9,10,11,12,17,19,20,21,22} |
argv@main | {14} | {9,19} |
buf@test | {6,8,14,17,20} | {8} |
str@test | {6,14,17,19,20} | {9} |
userstr@main | {14,17,19} | {19} |
Tab. 1 Static slice table
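The rows of Tab. 1 map each slicing criterion (a variable and the function it belongs to) to the line numbers in its backward and forward slices. A minimal sketch of how such a table might be held in memory is given below. The names `Slice`, `SLICE_TABLE` and `bidirectional_lines` are hypothetical, and treating a bidirectional slice as the plain union of the two line sets is an assumption made for illustration, not a description of the paper's slicing tool.

```python
from typing import Dict, FrozenSet, List, NamedTuple


class Slice(NamedTuple):
    """Line numbers reached by slicing backward and forward from one criterion."""
    backward: FrozenSet[int]
    forward: FrozenSet[int]


# Values copied from Tab. 1; keys follow its "variable@function" notation.
SLICE_TABLE: Dict[str, Slice] = {
    "argc@main": Slice(frozenset({14}),
                       frozenset({6, 8, 9, 10, 11, 12, 17, 19, 20, 21, 22})),
    "argv@main": Slice(frozenset({14}), frozenset({9, 19})),
    "buf@test": Slice(frozenset({6, 8, 14, 17, 20}), frozenset({8})),
    "str@test": Slice(frozenset({6, 14, 17, 19, 20}), frozenset({9})),
    "userstr@main": Slice(frozenset({14, 17, 19}), frozenset({19})),
}


def bidirectional_lines(criterion: str) -> List[int]:
    """Assumed combination rule: union of backward and forward lines, sorted."""
    entry = SLICE_TABLE[criterion]
    return sorted(entry.backward | entry.forward)


print(bidirectional_lines("buf@test"))  # [6, 8, 14, 17, 20]
```

Frozen sets keep each slice immutable and make the union a single operator; a real slicer would also need to preserve statement order and source text, which this sketch leaves out.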
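Parameter | Value | Parameter | Value |
---|---|---|---|
Number of filters (N) | 128 | Epochs | 20 |
Convolution window size (m) | 1, 3, 5 | Activation function | ReLU |
Number of GRU units (u) | 50 | Pooling method | MaxPooling1D |
Number of fully connected layer neurons | 484 | Optimizer | Adamax |
Dropout | 0.5 | Loss function | categorical_crossentropy |
Batch size | 256 | | |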
Tab. 3 Experimental parameters
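The parameters in Tab. 3 can be collected into a small configuration object. The sketch below is illustrative only: the class and field names are hypothetical, and the width check rests on an assumption that the table does not state, namely that the fully connected layer receives the concatenation of one max-pooled feature vector per convolution window and the output of a bidirectional GRU. Under that assumption the width would be 3 × 128 + 2 × 50 = 484, which matches the listed value.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ExperimentConfig:
    """Hyperparameters as listed in Tab. 3 (field names are illustrative)."""
    num_filters: int = 128
    window_sizes: Tuple[int, ...] = (1, 3, 5)
    gru_units: int = 50
    dense_units: int = 484
    dropout: float = 0.5
    batch_size: int = 256
    epochs: int = 20
    activation: str = "relu"
    optimizer: str = "adamax"
    loss: str = "categorical_crossentropy"


def fused_feature_width(cfg: ExperimentConfig, bidirectional: bool = True) -> int:
    """Width of the concatenated CNN + GRU features.

    ASSUMPTION: one pooled vector of `num_filters` values per window size,
    concatenated with a GRU output that is doubled when bidirectional.
    """
    cnn_width = cfg.num_filters * len(cfg.window_sizes)
    gru_width = cfg.gru_units * (2 if bidirectional else 1)
    return cnn_width + gru_width


cfg = ExperimentConfig()
print(fused_feature_width(cfg), cfg.dense_units)
```

That the two numbers agree is suggestive but not proof of the architecture; the paper's own model description remains the authority.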
Defect type | F1/% | Acc/% | Rec/% | Pre/% |
---|---|---|---|---|
Buffer overflow | 86.59 | 89.04 | 89.29 | 84.05 |
Format string | 89.86 | 91.88 | 91.21 | 88.55 |
Memory management | 88.06 | 90.79 | 91.74 | 84.67 |
Improper error handling | 89.95 | 93.57 | 93.14 | 86.98 |
Command execution | 95.49 | 94.38 | 97.73 | 93.35 |
Mixed | 96.13 | 96.59 | 97.69 | 94.62 |
Tab. 4 Experimental results comparison of the proposed method for different defects
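The F1 column in Tab. 4 is consistent with the usual definition of F1 as the harmonic mean of precision (Pre) and recall (Rec). The short check below recomputes it for two rows. The function name `f1_score` is chosen here for illustration, and the formula is the standard one rather than anything the table states explicitly.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (Pre, Rec) pairs copied from Tab. 4.
rows = {
    "Buffer overflow": (84.05, 89.29),
    "Mixed": (94.62, 97.69),
}

for name, (pre, rec) in rows.items():
    print(f"{name}: F1 = {f1_score(pre, rec):.2f}")
```

Both recomputed values round to the figures listed in the table (86.59 and 96.13).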
Slice type | Time/s | Number of tokens | Reuse ratio/% | F1/% | Acc/% |
---|---|---|---|---|---|
IFDS_Bo | 757 | 378 | 72.88 | 89.64 | 91.94 |
IFDS_Bw | 634 | 233 | 80.03 | 89.51 | 92.08 |
IFDS_Fw | 671 | 351 | 93.91 | 43.67 | 65.40 |
SDG_Bo | 737 | 378 | 73.00 | 89.50 | 91.92 |
SDG_Bw | 652 | 233 | 80.19 | 89.36 | 92.01 |
SDG_Fw | 695 | 351 | 93.91 | 45.80 | 65.04 |
Weiser_Bw | 949 | 257 | 84.01 | 54.11 | 69.59 |
Tab. 5 Experimental results comparison of different slicing methods
Detection method | Model | Key point category | Number of tokens | Embedding method | F1/% | Acc/% | Avg. training time per batch/ms | Avg. detection time per sample/ms |
---|---|---|---|---|---|---|---|---|
Rule-based | Flawfinder | - | - | - | 38.01 | 58.65 | - | 0.126 |
Deep learning-based | DCnnGRU | API | 363 | Skip-gram | 85.89 | 88.38 | 54.77 | 0.129 |
Deep learning-based | TextCNN+SVM | API | 363 | CBOW | 89.90 | 92.34 | 26.76+2227.20 | 2.446 |
Deep learning-based | BiGRU | API | 363 | FastText | 88.53 | 91.09 | 35.34 | 0.106 |
Deep learning-based | Proposed model | API | 363 | Skip-gram | 89.69 | 92.15 | 62.11 | 0.167 |
Proposed method | Proposed model | Variable | 378 | Skip-gram | 89.74 | 91.98 | 61.81 | 0.189 |
Proposed method | DCnnGRU | Variable | 378 | Skip-gram | 87.68 | 89.31 | 55.68 | 0.139 |
Proposed method | TextCNN+SVM | Variable | 378 | Skip-gram | 89.35 | 92.17 | 27.06+2059.67 | 2.509 |
Proposed method | BiGRU | Variable | 378 | Skip-gram | 88.79 | 91.16 | 35.87 | 0.117 |
Tab. 6 Experimental results comparison of different methods
[1] ABU-DABASEH F, ALSHAMMARI E. Automated penetration testing: an overview[C]// Proceedings of the 4th International Conference on Natural Language Computing. Chennai, Tamil Nadu: AIRCC Publishing Corporation, 2018: 121-129. DOI: 10.5121/csit.2018.80610
[2] LI Y, HUANG C L, WANG Z F, et al. Survey of software vulnerability mining methods based on machine learning[J]. Journal of Software, 2020, 31(7): 2040-2061.
[3] SEMASABA A O A, ZHENG W, WU X X, et al. Literature survey of deep learning-based vulnerability analysis on source code[J]. IET Software, 2020, 14(6): 654-664. DOI: 10.1049/iet-sen.2020.0084
[4] CVE Details. Browse vulnerabilities by date[EB/OL]. [2021-07-24].
[5] YAMAGUCHI F. Pattern-based methods for vulnerability discovery[J]. it - Information Technology, 2017, 59(2): 101-106. DOI: 10.1515/itit-2016-0037
[6] JIANG K L, BAI W, ZHANG L, et al. Malicious code detection based on multi-channel image deep learning[J]. Journal of Computer Applications, 2021, 41(4): 1142-1147.
[7] KIM S, WOO S, LEE H, et al. VUDDY: a scalable approach for vulnerable code clone discovery[C]// Proceedings of the 2017 IEEE Symposium on Security and Privacy. Piscataway: IEEE, 2017: 595-614. DOI: 10.1109/sp.2017.62
[8] GRIECO G, GRINBLAT G L, UZAL L, et al. Toward large-scale vulnerability discovery using machine learning[C]// Proceedings of the 6th ACM Conference on Data and Application Security and Privacy. New York: ACM, 2016: 85-96. DOI: 10.1145/2857705.2857720
[9] SCANDARIATO R, WALDEN J, HOVSEPYAN A, et al. Predicting vulnerable software components via text mining[J]. IEEE Transactions on Software Engineering, 2014, 40(10): 993-1006. DOI: 10.1109/tse.2014.2340398
[10] MIRSKY Y, DEMONTIS A, KOTAK J, et al. The threat of offensive AI to organizations[EB/OL]. (2021-06-30) [2021-07-26].
[11] RUSSELL R, KIM L, HAMILTON L, et al. Automated vulnerability detection in source code using deep representation learning[C]// Proceedings of the 2018 17th IEEE International Conference on Machine Learning and Applications. Piscataway: IEEE, 2018: 757-762. DOI: 10.1109/icmla.2018.00120
[12] ZHOU Y Q, LIU S Q, SIOW J, et al. Devign: effective vulnerability identification by learning comprehensive program semantics via graph neural networks[C/OL]// Proceedings of the 33rd Conference on Neural Information Processing Systems. [2021-07-27].
[13] XU J, CHEN P H, XIONG J B. Code vulnerability detection model based on sliding window and hash function[J]. Application Research of Computers, 2021, 38(8): 2394-2400.
[14] LI Z, ZOU D Q, XU S H, et al. VulDeePecker: a deep learning-based system for vulnerability detection[EB/OL]. (2018-01-05) [2021-07-27]. DOI: 10.14722/ndss.2018.23158
[15] LI Y C, CUI Y Q, LYU J F, et al. Combined deep learning method for open source software vulnerability detection[J]. Computer Engineering and Applications, 2019, 55(11): 52-59.
[16] WANG X M, GUAN Z B, XIN W, et al. Source code defect detection using deep convolutional neural networks[J]. Journal of Tsinghua University (Science and Technology), 2021, 61(11): 1267-1272.
[17] LI X, WANG L, XIN Y, et al. Automated vulnerability detection in source code using minimum intermediate representation learning[J]. Applied Sciences, 2020, 10(5): No.1692. DOI: 10.3390/app10051692
[18] JEON S, KIM H K. AutoVAS: an automated vulnerability analysis system with a deep learning approach[J]. Computers and Security, 2021, 106: No.102308. DOI: 10.1016/j.cose.2021.102308
[19] CHANDRA A, SINGHAL A, BANSAL A. A study of program slicing techniques for software development approaches[C]// Proceedings of the 1st International Conference on Next Generation Computing Technologies. Piscataway: IEEE, 2015: 622-627. DOI: 10.1109/ngct.2015.7375196
[20] MIKOLOV T, CHEN K, CORRADO G, et al. Efficient estimation of word representations in vector space[EB/OL]. (2013-09-07) [2021-07-29]. DOI: 10.3126/jiee.v3i1.34327
[21] KIM Y. Convolutional neural networks for sentence classification[C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: Association for Computational Linguistics, 2014: 1746-1751. DOI: 10.3115/v1/d14-1181
[22] National Institute of Standards and Technology. Software assurance reference dataset[DS/OL]. [2021-08-02]. DOI: 10.1109/dasc.2007.4391957
[23] WHEELER D A. Flawfinder[EB/OL]. [2017-08-26].