Malicious code detection based on multi-channel image deep learning

Journal of Computer Applications ›› 2021, Vol. 41 ›› Issue (4): 1142-1147. DOI: 10.11772/j.issn.1001-9081.2020081224

Special topic: Cyberspace security

JIANG Kaolin, BAI Wei, ZHANG Lei, CHEN Jun, PAN Zhisong, GUO Shize

Command and Control Engineering College, Army Engineering University, Nanjing, Jiangsu 210007, China

Received: 2020-08-13 Revised: 2020-10-19 Online: 2021-04-10 Published: 2020-11-25
Corresponding author: PAN Zhisong
About the authors: JIANG Kaolin (born 1996), male, from Leping, Jiangxi, M.S. candidate; research interests: application software security, deep learning. BAI Wei (born 1983), male, from Chicheng, Hebei, lecturer, Ph.D.; research interests: network security management, network vulnerability analysis. ZHANG Lei (born 1989), male, from Yichun, Jiangxi, Ph.D. candidate; research interests: network security, artificial intelligence, reinforcement learning. CHEN Jun (born 1986), male, from Lezhi, Sichuan, Ph.D. candidate; research interests: malicious code detection, deep learning. PAN Zhisong (born 1973), male, from Zhao'an, Fujian, professor, Ph.D.; research interests: deep learning, pattern recognition. GUO Shize (born 1969), male, from Lixian, Hebei, professor, Ph.D.; research interests: information technology, information security.
Supported by:
This work is partially supported by the National Key Research and Development Program of China (2017YFB0802800).

Abstract

Abstract: Existing deep learning-based malicious code detection methods suffer from weak extraction of deep-level features, relatively complex models, and insufficient generalization. Meanwhile, code reuse is widespread among malicious samples of the same class; reused code produces similar visual features, and this similarity can be exploited for malicious code detection. A malicious code detection method based on multi-channel image visual features and the AlexNet neural network is therefore proposed. The method first converts the code under inspection into a multi-channel image, then uses AlexNet to extract the image's color texture features and classify them, thereby detecting possible malicious code. By combining multi-channel image feature extraction, Local Response Normalization (LRN), and related techniques, the method improves the model's generalization ability while effectively reducing model complexity. In tests on the balanced Malimg dataset, the method achieved an average classification accuracy of 97.8%; compared with the VGGNet method, accuracy improved by 1.8% and detection efficiency by 60.2%. The experimental results show that the color texture features of multi-channel images reflect the class information of malicious code well, that the relatively simple structure of AlexNet effectively improves detection efficiency, and that LRN improves the model's generalization ability and detection performance.

Key words: multi-channel image, color texture feature, malicious code, deep learning, Local Response Normalization (LRN)
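The two preprocessing ideas named in the abstract, converting code into a multi-channel image and applying LRN, can be illustrated with a minimal NumPy sketch. The abstract does not specify how bytes are mapped to channels, so the mapping here (each consecutive group of three bytes becomes one RGB pixel, with a zero-padded tail and a fixed row width) is an assumption made purely for illustration, and the function names `bytes_to_rgb_image` and `local_response_norm` are hypothetical. The LRN function follows the cross-channel formulation and default constants (n=5, k=2, alpha=1e-4, beta=0.75) of the original AlexNet paper; it is not necessarily the configuration used by the authors.

```python
import numpy as np


def bytes_to_rgb_image(data: bytes, width: int = 64) -> np.ndarray:
    """Map a raw byte string to an H x W x 3 uint8 array.

    Assumption (not taken from the paper): each consecutive group of
    three bytes becomes one pixel's (R, G, B) values, rows have a fixed
    width, and the final row is zero-padded.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    arr = np.frombuffer(data, dtype=np.uint8)
    bytes_per_row = width * 3
    rows = max(1, -(-arr.size // bytes_per_row))  # ceiling division
    padded = np.zeros(rows * bytes_per_row, dtype=np.uint8)
    padded[: arr.size] = arr
    return padded.reshape(rows, width, 3)


def local_response_norm(x, n=5, k=2.0, alpha=1e-4, beta=0.75):
    """Cross-channel LRN over the last axis of an H x W x C array.

    Each activation is divided by (k + alpha * sum of squares over up to
    n neighbouring channels) ** beta, as in the AlexNet paper.
    """
    x = np.asarray(x, dtype=np.float64)
    channels = x.shape[-1]
    squared = x ** 2
    out = np.empty_like(x)
    half = n // 2
    for i in range(channels):
        lo, hi = max(0, i - half), min(channels - 1, i + half)
        denom = (k + alpha * squared[..., lo:hi + 1].sum(axis=-1)) ** beta
        out[..., i] = x[..., i] / denom
    return out


# Usage: 1024 synthetic bytes stand in for a binary sample.
sample = bytes(range(256)) * 4
image = bytes_to_rgb_image(sample, width=16)
normalized = local_response_norm(image)
print(image.shape, normalized.dtype)
```

In practice LRN is applied to convolutional feature maps inside the network rather than to the input image; it is applied to the image here only to keep the sketch short.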

CLC number:

TP309

JIANG Kaolin, BAI Wei, ZHANG Lei, CHEN Jun, PAN Zhisong, GUO Shize. Malicious code detection based on multi-channel image deep learning[J]. Journal of Computer Applications, 2021, 41(4): 1142-1147.

