Journal of Computer Applications, 2023, Vol. 43, Issue (7): 2217-2225. DOI: 10.11772/j.issn.1001-9081.2022060931
Special Issue: Cyberspace security
• Cyber security •
Bona XUAN, Jin LI, Yafei SONG, Zexuan MA
Received: 2022-06-28
Revised: 2022-08-27
Accepted: 2022-09-05
Online: 2022-11-25
Published: 2023-07-10
Contact: Jin LI
About author: XUAN Bona, born in 1991 in Xianyang, Shaanxi, M. S. candidate. Her research interests include network security defense and malicious code classification.
Bona XUAN, Jin LI, Yafei SONG, Zexuan MA. Malicious code classification method based on improved MobileNetV2[J]. Journal of Computer Applications, 2023, 43(7): 2217-2225.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2022060931
Input size | Operation | e | c | n | s |
---|---|---|---|---|---|
256²×3 | Conv2d | — | 16 | 1 | 2 |
112²×16 | Bottleneck | 1 | 16 | 1 | 2 |
56²×16 | Bottleneck | 3 | 24 | 1 | 2 |
28²×24 | Bottleneck | 3 | 24 | 2 | 1 |
14²×40 | Bottleneck | 6 | 48 | 4 | 1 |
7²×96 | Bottleneck | 6 | 96 | 2 | 2 |
7²×96 | Conv2d, 1×1 | 6 | 576 | 1 | 1 |
7²×576 | Avgpool 7×7 | — | — | 1 | 1 |
7²×576 | Conv2d, 1×1 | — | 1 028 | 1 | 1 |
8×1×256 | Avgpool 8×8 | — | 1 028 | 1 | 1 |
8×1×256 | Avgpool 8×8 | — | 1 028 | 1 | 1 |
16×1×256 | Concat | — | 1 028 | 1 | 1 |
16×1×256 | Conv2d, 1×1 | — | 1 028 | 1 | 1 |
16×1×40 | BatchNorm+ReLU | — | 1 028 | 1 | 1 |
1²×1 280 | Multiply, 8×8 | — | 1 028 | 1 | 1 |
1²×1 280 | Conv2d, 1×1 | — | 1 028 | 1 | 1 |
1²×1 280 | Dropout | — | 1 028 | 1 | 1 |
1²×1 280 | Conv2d, 1×1 | — | class | 1 | 1 |
Tab. 1 MobileNetV2 model parameters
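To give a feel for why the Bottleneck rows of Tab. 1 stay lightweight, the following minimal sketch counts the convolution weights of one inverted-residual bottleneck. It assumes that column e is the expansion factor and column c the number of output channels, as in the usual MobileNetV2 notation; the page itself does not define these columns. The function names, the 3×3 depthwise kernel, the omission of bias and BatchNorm parameters, and the skipped expansion layer when e = 1 are all illustrative assumptions, not the authors' implementation.

```python
def bottleneck_weights(c_in: int, e: int, c_out: int, kernel: int = 3) -> int:
    """Count convolution weights in one inverted-residual bottleneck.

    Assumptions (not stated in Tab. 1): no bias terms, BatchNorm
    parameters ignored, and the 1x1 expansion is skipped when e == 1.
    """
    hidden = c_in * e                          # channels after expansion
    expand = c_in * hidden if e != 1 else 0    # 1x1 pointwise expansion
    depthwise = kernel * kernel * hidden       # one k x k filter per channel
    project = hidden * c_out                   # 1x1 linear projection
    return expand + depthwise + project


def standard_conv_weights(c_in: int, c_out: int, kernel: int = 3) -> int:
    """Weights of an ordinary k x k convolution, for comparison."""
    return kernel * kernel * c_in * c_out


if __name__ == "__main__":
    # Hypothetical example: 16 input channels, e = 3, 24 output channels.
    print(bottleneck_weights(16, 3, 24))
    print(standard_conv_weights(16, 24))
```

The depthwise term grows linearly with the channel count rather than with the product of input and output channels, which is where most of the saving over a standard convolution comes from.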
Sample size/KB | Width/pixel |
---|---|
(0, 6] | 32 |
(6, 20] | 64 |
(20, 46] | 128 |
(46, 80] | 256 |
(80, 160] | 384 |
(160, 800] | 512 |
(800, 1 200] | 1 024 |
(1 200, 76 900] | 2 048 |
Tab. 2 Mapping of binary file and grayscale image width
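The mapping in Tab. 2 can be expressed as a small lookup. The sketch below picks the image width from the file size and then cuts the raw bytes into rows of 8-bit gray levels. Only the interval bounds and widths come from Tab. 2; the function names, the convention 1 KB = 1024 bytes, and the zero-padding of the last row are assumptions made for illustration, and the authors' actual conversion code may differ.

```python
from bisect import bisect_left

# Inclusive upper bounds in KB and the matching widths in pixels, from Tab. 2.
SIZE_UPPER_KB = [6, 20, 46, 80, 160, 800, 1200, 76900]
WIDTHS_PX = [32, 64, 128, 256, 384, 512, 1024, 2048]


def width_for_size(size_kb: float) -> int:
    """Return the grayscale image width for a sample of size_kb kilobytes."""
    if not 0 < size_kb <= SIZE_UPPER_KB[-1]:
        raise ValueError(f"size {size_kb} KB is outside the ranges of Tab. 2")
    # bisect_left finds the first interval whose upper bound is >= size_kb.
    return WIDTHS_PX[bisect_left(SIZE_UPPER_KB, size_kb)]


def bytes_to_gray_rows(data: bytes) -> list[bytes]:
    """Split a binary into image rows, one byte per pixel (0-255 gray level).

    Assumptions: 1 KB = 1024 bytes, and the last row is padded with zeros.
    """
    width = width_for_size(len(data) / 1024)
    padded = data + bytes(-len(data) % width)  # zero-fill the final row
    return [padded[i:i + width] for i in range(0, len(padded), width)]
```

For example, a 10 KB sample falls in the (6, 20] interval, so `bytes_to_gray_rows` would produce rows that are 64 pixels wide. Resizing the result to a fixed network input such as 256×256 is a separate step that this sketch does not cover.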
Image size/pixel | Accuracy/% | Recall/% | Precision/% | F1/% |
---|---|---|---|---|
32×32 | 93.14 | 92.49 | 93.00 | 93.36 |
64×64 | 93.68 | 93.01 | 93.13 | 93.08 |
128×128 | 94.84 | 94.01 | 94.09 | 94.28 |
256×256 | 96.98 | 96.94 | 95.88 | 96.69 |
Tab. 3 Comparison of evaluation indicators of image samples with different sizes from DataCon dataset
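The four indicators reported in Tab. 3 can all be derived from a multi-class confusion matrix. The sketch below shows one common way to do so. The page does not say how the per-family scores were averaged, so macro-averaging (an unweighted mean over classes) is an assumption here, as are the function name and the example matrix.

```python
def macro_metrics(cm: list[list[int]]) -> dict[str, float]:
    """Accuracy plus macro-averaged recall, precision and F1.

    cm[i][j] is the number of samples of true class i predicted as class j.
    Macro-averaging is an assumption; the source does not state the scheme.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(n)) / total

    precisions, recalls, f1s = [], [], []
    for i in range(n):
        tp = cm[i][i]
        predicted = sum(cm[r][i] for r in range(n))  # column sum: TP + FP
        actual = sum(cm[i])                          # row sum: TP + FN
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    return {
        "accuracy": accuracy,
        "recall": sum(recalls) / n,
        "precision": sum(precisions) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    # Hypothetical two-family confusion matrix, not data from the paper.
    print(macro_metrics([[8, 2], [1, 9]]))
```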
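Model | Training method | Accuracy/% |
---|---|---|
ResNet50 | Pre-training | 91.90 |
ResNet50 | Fine-tuning | 94.40 |
DenseNet169 | Pre-training | 93.41 |
DenseNet169 | Fine-tuning | 94.61 |
InceptionV3 | Pre-training | 91.41 |
InceptionV3 | Fine-tuning | 92.88 |
MobileNetV2 | Pre-training | 92.97 |
MobileNetV2 | Fine-tuning | 95.30 |
Proposed model | Pre-training | 94.39 |
Proposed model | Fine-tuning | 96.98 |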
Tab. 4 Pretraining and fine-tuning of different models on DataCon dataset
Method | Accuracy/% | Recall/% | Precision/% | F1/% | Network depth | Parameters | Image size/pixel |
---|---|---|---|---|---|---|---|
TL-ResNet50 | 94.40 | 94.40 | 94.67 | 94.22 | 168 | 25 636 712 | 256×256 |
TL-DenseNet169 | 94.61 | 94.61 | 94.31 | 94.53 | 169 | 14 307 880 | 256×256 |
TL-InceptionV3 | 92.88 | 92.48 | 92.61 | 92.51 | 126 | 22 910 480 | 256×256 |
TL-MobileNetV2 | 95.30 | 95.23 | 95.41 | 95.28 | 88 | 2 259 265 | 256×256 |
CATM | 96.98 | 96.94 | 95.88 | 96.69 | 92 | 2 357 984 | 256×256 |
Tab. 5 Comparison results of different transfer learning based methods on DataCon dataset
Attention mechanism | Feature | Accuracy/% | Recall/% | Precision/% | F1/% | Time/ms |
---|---|---|---|---|---|---|
SE-TM | Grayscale image | 96.08 | 94.65 | 95.43 | 95.55 | 8.0 |
CBAM-TM | Grayscale image | 96.23 | 95.03 | 95.63 | 95.63 | 9.6 |
CATM | Grayscale image | 96.98 | 96.94 | 95.88 | 96.69 | 7.3 |
Tab. 6 Comparison results of attention mechanisms on DataCon dataset
Optimizer | Accuracy/% | Recall/% | Precision/% | F1/% |
---|---|---|---|---|
CATM-Adam | 96.66 | 96.21 | 95.58 | 95.86 |
CATM-Ranger21 | 96.98 | 96.94 | 95.88 | 96.69 |
Tab. 7 Comparison of different optimizers on DataCon dataset
Method | Feature | Accuracy/% | Recall/% | Precision/% | F1/% | Time/ms | Same-family variant accuracy/%: Swizzor.gen!E | Same-family variant accuracy/%: Swizzor.gen!I |
---|---|---|---|---|---|---|---|---|
Ref. [ | Grayscale image | 94.50 | 94.50 | 94.60 | 94.50 | 20.3 | 71.00 | 62.00 |
Ref. [ | Grayscale image | 98.48 | 96.56 | 95.80 | 95.80 | — | 88.00 | 82.00 |
Ref. [ | Histogram+GLCM | 98.58 | 98.06 | 98.04 | 98.05 | — | — | — |
Ref. [ | Grayscale image | 98.27 | 98.25 | 98.19 | 98.20 | — | 88.00 | 82.00 |
Ref. [ | RGB | 98.82 | 98.85 | 98.81 | 98.75 | — | 85.00 | 74.00 |
Ref. [ | AlexNet+RGB | 97.80 | 97.80 | 98.00 | 97.80 | 9.3 | 97.00 | 80.00 |
Proposed method | Grayscale image | 99.26 | 98.99 | 99.14 | 99.11 | 6.4 | 90.00 | 86.00 |
Tab. 8 Comparison of different classification methods on Malimg dataset
Method | Feature | Accuracy/% | Recall/% | Precision/% | F1/% | Time/ms | Image size/pixel |
---|---|---|---|---|---|---|---|
Ref. [ | Multiple features | 95.89 | 89.84 | — | 88.55 | — | 256×256 |
DataCon competition [ | Grayscale image | 96.80 | 96.42 | 96.26 | 97.38 | — | 512×512 |
Proposed method | Grayscale image | 96.98 | 96.94 | 95.88 | 96.69 | 7.2 | 256×256 |
Tab. 9 Comparison of different classification methods on DataCon dataset
[1] QiAnXin. QiAnXin releases Analysis report on network security emergency response in the first half of 2020[EB/OL]. (2020-08-18) [2022-04-20].
[2] CONTI G, BRATUS S, SHUBINA A, et al. Automated mapping of large binary objects using primitive fragment type classification[J]. Digital Investigation, 2010, 7(S): S3-S12. DOI: 10.1016/j.diin.2010.05.002
[3] NATARAJ L, YEGNESWARAN V, PORRAS P, et al. A comparative assessment of malware classification using binary texture analysis and dynamic analysis[C]// Proceedings of the 4th ACM Workshop on Security and Artificial Intelligence. New York: ACM, 2011: 21-30. DOI: 10.1145/2046684.2046689
[4] NATARAJ L, KARTHIKEYAN S, JACOB G, et al. Malware images: visualization and automatic classification[C]// Proceedings of the 8th International Symposium on Visualization for Cyber Security. New York: ACM, 2011: No.4. DOI: 10.1145/2016904.2016908
[5] CUI Z H, DU L, WANG P H, et al. Malicious code detection based on CNNs and multi-objective algorithm[J]. Journal of Parallel and Distributed Computing, 2019, 129: 50-58. DOI: 10.1016/j.jpdc.2019.03.010
[6] GIBERT D, MATEU C, PLANES J, et al. Using convolutional neural networks for classification of malware represented as images[J]. Journal of Computer Virology and Hacking Techniques, 2019, 15(1): 15-28. DOI: 10.1007/s11416-018-0323-0
[7] VASAN D, ALAZAB M, WASSAN S, et al. IMCFN: image-based malware classification using fine-tuned convolutional neural network architecture[J]. Computer Networks, 2020, 171: No.107138. DOI: 10.1016/j.comnet.2020.107138
[8] VERMA V, MUTTOO S K, SINGH V B. Multiclass malware classification via first- and second-order texture statistics[J]. Computers and Security, 2020, 97: No.101895. DOI: 10.1016/j.cose.2020.101895
[9] YANG W, GAO M Z, JIANG T. A malicious code static detection framework based on multi-feature ensemble learning[J]. Journal of Computer Research and Development, 2021, 58(5): 1021-1034. DOI: 10.7544/issn1000-1239.2021.20200912
[10] QiAnXin. DataCon datasets[DB/OL]. [2022-04-25].
[11] MAKANDAR A, PATROT A. Malware class recognition using image processing techniques[C]// Proceedings of the 2017 International Conference on Data Management, Analytics and Innovation. Piscataway: IEEE, 2017: 76-80. DOI: 10.1109/icdmai.2017.8073489
[12] QIU K F. Research and implementation of PE malware intelligent mutation[D]. Tianjin: Nankai University, 2020: 26-40.
[13] FUCHS F B, WORRALL D E, FISCHER V, et al. SE(3)-Transformers: 3D roto-translation equivariant attention networks[C]// Proceedings of the 34th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2020: 1970-1981.
[14] WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11211. Cham: Springer, 2018: 3-19.
[15] HOU Q B, ZHOU D Q, FENG J S. Coordinate attention for efficient mobile network design[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 13708-13717. DOI: 10.1109/cvpr46437.2021.01350
[16] CHEN L. Deep transfer learning for static malware classification[EB/OL]. (2018-12-18) [2022-04-23]. DOI: 10.36227/techrxiv.17259806
[17] SANDLER M, HOWARD A, ZHU M L, et al. MobileNetV2: inverted residuals and linear bottlenecks[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 4510-4520. DOI: 10.1109/cvpr.2018.00474
[18] WANG W, LI Y T, ZOU T, et al. A novel image classification approach via Dense-MobileNet models[J]. Mobile Information Systems, 2020, 2020: No.7602384. DOI: 10.1155/2020/7602384
[19] CHENG S L, WANG L J, DU A Y. Asymmetric coordinate attention spectral-spatial feature fusion network for hyperspectral image classification[J]. Scientific Reports, 2021, 11: No.17408. DOI: 10.1038/s41598-021-97029-5
[20] KIM J H, ON K W, LIM W, et al. Hadamard product for low-rank bilinear pooling[EB/OL]. (2017-03-26) [2022-04-23].
[21] ZHUANG F Z, QI Z Y, DUAN K Y, et al. A comprehensive survey on transfer learning[J]. Proceedings of the IEEE, 2021, 109(1): 43-76. DOI: 10.1109/jproc.2020.3004555
[22] WRIGHT L, DEMEURE N. Ranger21: a synergistic deep learning optimizer[EB/OL]. (2021-08-07) [2022-04-23].
[23] LOSHCHILOV I, HUTTER F. Decoupled weight decay regularization[EB/OL]. (2019-01-04) [2022-04-23].
[24] ZHANG M R, LUCAS J, HINTON G, et al. Lookahead optimizer: k steps forward, 1 step back[C]// Proceedings of the 33rd International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2019: 9597-9608. DOI: 10.1515/9781618110770
[25] JIANG K L, BAI W, ZHANG L, et al. Malicious code detection based on multi-channel image deep learning[J]. Journal of Computer Applications, 2021, 41(4): 1142-1147. DOI: 10.11772/j.issn.1001-9081.2020081224