Journal of Computer Applications ›› 2018, Vol. 38 ›› Issue (10): 2929-2933. DOI: 10.11772/j.issn.1001-9081.2018030691


Malicious file detection method based on image texture and convolutional neural network

JIANG Chen, HU Yupeng, SI Kai, KUANG Wenxin   

  1. College of Information Science and Engineering, Hunan University, Changsha 410000, China
  • Received: 2018-04-04 Revised: 2018-05-10 Online: 2018-10-10 Published: 2018-10-13
  • Supported by:
    This work is partially supported by the Huxiang Youth Talent Plan (2017RS3018).

基于图像纹理和卷积神经网络的恶意文件检测方法

蒋晨, 胡玉鹏, 司凯, 旷文鑫   

  1. 湖南大学 信息科学与工程学院, 长沙 410000
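  • Corresponding author: HU Yupeng
  • About the authors: JIANG Chen (born 1992), female, from Chuzhou, Anhui, master's student; research interests: Android security, distributed cloud storage. HU Yupeng (born 1981), male, from Hengyang, Hunan, associate professor, Ph.D.; research interests: cloud storage security and reliability, distributed cloud storage, Android security. SI Kai (born 1992), male, from Zhoukou, Henan, master's student; research interests: natural language processing, machine learning, deep learning. KUANG Wenxin (born 1993), female, from Hengyang, Hunan, Ph.D. candidate; research interests: social networks, cloud computing, distributed storage.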

Abstract: In big data environments, traditional malicious file detection methods achieve low accuracy on malicious files that have undergone code mutation and obfuscation, and they generalize poorly to cross-platform malicious files. To address these problems, a malicious file detection method based on image texture and a Convolutional Neural Network (CNN) is proposed. First, a grayscale image generation algorithm converts the executable files of the Android and Windows platforms, namely .dex and .exe files, into corresponding grayscale images. Then, a CNN automatically extracts and learns the texture features of these grayscale images to build a malicious file detection model. Finally, a large number of previously unseen files are used to test the accuracy of the model. In experiments on a large number of malicious samples, the highest accuracy of the model reached 79.6% on Android and 97.6% on Windows, and the average accuracies were approximately 79.3% and 96.8%, respectively. Compared with the texture fingerprint-based method for detecting malicious code variants, the accuracy of the proposed method improved by about 20%. The experimental results indicate that the proposed method avoids the problems caused by manual feature selection, greatly improves detection accuracy and efficiency, solves the cross-platform detection problem, and yields an end-to-end malicious file detection model.

Key words: big data, malicious file detection, deep learning, grayscale image, Convolutional Neural Network (CNN)
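
The first step described in the abstract, converting an executable file into a grayscale image, can be illustrated with a minimal sketch. The abstract does not give the details of the generation algorithm, so everything specific here is an assumption made for illustration: the function names `bytes_to_grayscale` and `to_pgm`, the fixed row width, the zero padding of the last row, and the PGM output format. Each byte of the file (0 to 255) is treated as one pixel intensity.

```python
import math


def bytes_to_grayscale(data: bytes, width: int = 256) -> list[list[int]]:
    """Map each byte (0-255) to one grayscale pixel, row by row.

    The fixed row width and the zero padding of the last row are
    illustrative assumptions, not parameters taken from the paper.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    height = math.ceil(len(data) / width)
    # Pad the final row with black pixels so every row has the same length.
    padded = data.ljust(height * width, b"\x00")
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


def to_pgm(pixels: list[list[int]]) -> bytes:
    """Serialize the pixel grid as a binary PGM (P5) image."""
    height = len(pixels)
    width = len(pixels[0]) if pixels else 0
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    return header + b"".join(bytes(row) for row in pixels)


# Hypothetical stand-in for the leading bytes of a .dex file.
sample = b"dex\n035\x00" + bytes(range(10))
image = bytes_to_grayscale(sample, width=4)
pgm = to_pgm(image)
```

In practice the bytes would come from reading a real file in binary mode, and the resulting image would typically be resized to a fixed shape before being passed to a CNN; both steps are omitted here because the abstract does not specify them.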

