Journal of Computer Applications ›› 2021, Vol. 41 ›› Issue (3): 780-785.DOI: 10.11772/j.issn.1001-9081.2020060906

Special Issue: Cyberspace Security

• Cyber security

Malware detection method based on perceptual hash algorithm and feature fusion

JIANG Qianyu, WANG Fengying, JIA Lipeng   

  1. School of Computer Science and Technology, Shandong University of Technology, Zibo Shandong 255022, China
  • Received: 2020-06-27 Revised: 2020-11-02 Online: 2021-03-10 Published: 2021-01-15
  • Supported by:
    This work is partially supported by the Integration Development Program of School and City of Zibo (2018ZBXC295).

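  • Corresponding author: WANG Fengying
  • About the authors: JIANG Qianyu (born 1995), female, from Tai'an, Shandong, master's candidate, main research interest: network security; WANG Fengying (born 1962), female, from Zibo, Shandong, professor, master's degree, main research interest: network security; JIA Lipeng (born 1995), male, from Binzhou, Shandong, master's candidate, main research interest: network security.
  • Funding:
    Integration Development Program of School and City of Zibo (2018ZBXC295).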

Abstract: In current malware family detection, neither the local features nor the global features extracted from the grayscale image of a malware sample can fully describe that sample. To address this problem and to improve detection efficiency, a malware detection method based on a perceptual hash algorithm and feature fusion was proposed. Firstly, the grayscale images of malware samples were checked with the perceptual hash algorithm, which quickly separated samples belonging to specific malware families from samples whose family was uncertain. Experimental tests showed that about 67% of the malware samples could be detected by the perceptual hash algorithm. Then, for the samples of uncertain families, the Local Binary Pattern (LBP) local features and the Gist global features were extracted, and the fusion of the two was used by a machine learning algorithm to classify and detect the samples. Finally, experimental results on 25 malware families showed that detection accuracy was higher with the fused LBP and Gist features than with either single feature, and that the proposed method classified and detected more efficiently than a detection algorithm using machine learning only, with the detection speed increased by 93.5%.
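
The two-stage idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not state the perceptual hash variant, the distance threshold, the image width, or the fusion rule, so the average hash, `max_dist=5`, `width=16`, and plain concatenation used here are all assumptions. `global_descriptor` is a crude block-mean stand-in for Gist (which is normally built from Gabor filter responses), and every function name is hypothetical.

```python
def bytes_to_gray(data, width=16):
    """Lay raw bytes out row by row as a grayscale image (one byte = one pixel)."""
    rows = len(data) // width  # a trailing partial row is dropped
    return [list(data[r * width:(r + 1) * width]) for r in range(rows)]

def downsample(img, size=8):
    """Shrink to size x size by averaging pixel blocks (box filter)."""
    h, w = len(img), len(img[0])
    assert h >= size and w >= size, "image smaller than target grid"
    out = []
    for i in range(size):
        r0, r1 = i * h // size, (i + 1) * h // size
        row = []
        for j in range(size):
            c0, c1 = j * w // size, (j + 1) * w // size
            block = [img[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out

def average_hash(img, size=8):
    """Average hash: one bit per downsampled pixel, set if above the mean."""
    small = [p for row in downsample(img, size) for p in row]
    mean = sum(small) / len(small)
    bits = 0
    for p in small:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")

def triage(sample_hash, family_hashes, max_dist=5):
    """Stage 1: return the nearest family if close enough, else None (uncertain)."""
    family, ref = min(family_hashes.items(),
                      key=lambda kv: hamming(sample_hash, kv[1]))
    return family if hamming(sample_hash, ref) <= max_dist else None

def lbp_histogram(img):
    """Local feature: normalized 256-bin histogram of 8-neighbor LBP codes."""
    h, w = len(img), len(img[0])
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    for r in range(1, h - 1):          # interior pixels only
        for c in range(1, w - 1):
            code = 0
            for dr, dc in offsets:
                bit = 1 if img[r + dr][c + dc] >= img[r][c] else 0
                code = (code << 1) | bit
            hist[code] += 1
    total = sum(hist) or 1
    return [v / total for v in hist]

def global_descriptor(img, grid=4):
    """Global feature stand-in (NOT Gist): block means on a grid x grid layout."""
    return [p / 255 for row in downsample(img, grid) for p in row]

def fused_features(img):
    """Stage 2 input: local and global vectors concatenated (assumed fusion rule)."""
    return lbp_histogram(img) + global_descriptor(img)
```

In such a pipeline, a sample for which `triage` returns a family name would be labeled directly, and only samples for which it returns `None` would have `fused_features` computed and passed to a trained classifier, which is where the efficiency gain reported in the abstract would come from.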

Key words: malware family detection, perceptual hash, image feature, feature fusion, machine learning

