Journal of Computer Applications ›› 2015, Vol. 35 ›› Issue (4): 972-976. DOI: 10.11772/j.issn.1001-9081.2015.04.0972

• Information Security •

Malware behavior assessment system based on support vector machine

欧阳博宇, 刘新, 徐婵, 吴建, 安晓   

  1. College of Information Engineering, Xiangtan University, Xiangtan, Hunan 411105
  • Received: 2014-11-04 Revised: 2014-12-30 Online: 2015-04-10 Published: 2015-04-08
  • Corresponding author: LIU Xin
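  • About the authors: OUYANG Boyu (born 1989), male, from Xiangtan, Hunan, master's student, CCF member, research interest: information security; LIU Xin (born 1975), male, from Xiangtan, Hunan, associate professor, Ph.D., CCF member, research interests: information security and social computing; XU Chan (born 1988), female, from Hengyang, Hunan, master's student, CCF member, research interest: information security; WU Jian (born 1990), male, from Changde, Hunan, master's student, CCF member, research interest: information security; AN Xiao (born 1990), female, from Nanyang, Henan, master's student, CCF member, research interest: information retrieval.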
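  • Supported by:

    Natural Science Foundation of Hunan Province (12JJ3066); Open Project Fund of the Key Laboratory of the Ministry of Education (2013IM02); Key Discipline Construction Fund of Hunan Province under the Twelfth Five-Year Plan.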

Malware behavior assessment system based on support vector machine

OUYANG Boyu, LIU Xin, XU Chan, WU Jian, AN Xiao   

  1. College of Information Engineering, Xiangtan University, Xiangtan, Hunan 411105, China
  • Received: 2014-11-04 Revised: 2014-12-30 Online: 2015-04-10 Published: 2015-04-08

Abstract:

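To solve the problem of low classification accuracy in malware behavior analysis systems, a malware classification method based on Support Vector Machine (SVM) is proposed. A risk behavior library that takes the results of software behaviors as features is first built by hand. All behaviors of a piece of software are then captured and matched against the risk behavior library, a sample conversion algorithm turns the matching results into data suitable for SVM processing, and the SVM performs the classification. For the choice of SVM model, kernel function and parameter pair (C, g), theoretical analysis first determines the approximate range, and a combination of grid search and Genetic Algorithm (GA) then searches for the optimum. To verify the effectiveness of the proposed classification method, a malware behavior assessment system based on the SVM model was designed. Experimental results show that the false positive rate and false negative rate of the system are 5.52% and 3.04% respectively, which is better than the K-Nearest Neighbor (KNN) and Naive Bayes (NB) algorithms and comparable to the Back Propagation (BP) neural network, while training and classification are more efficient than with the BP neural network.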

Key words: malware, support vector machine, genetic algorithm, behavior assessment
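The matching and conversion step described in the abstract can be illustrated with a minimal Python sketch. The behavior names in `RISK_BEHAVIORS`, the binary 0/1 encoding, and the LIBSVM-style sparse output line are all assumptions made for illustration; the abstract does not specify the contents of the risk behavior library or the details of the sample conversion algorithm.

```python
# Hypothetical risk behavior library. Each entry names a behavior *result*
# (what the software ended up doing). These names are illustrative only.
RISK_BEHAVIORS = [
    "modify_autorun_registry_key",
    "inject_into_remote_process",
    "delete_own_executable",
    "disable_security_service",
]


def to_feature_vector(captured_behaviors):
    """Return a dense 0/1 vector: 1 where a captured behavior matches the library."""
    matched = set(captured_behaviors) & set(RISK_BEHAVIORS)
    return [1 if name in matched else 0 for name in RISK_BEHAVIORS]


def to_libsvm_line(label, vector):
    """Format one sample as 'label index:value ...' (1-based), omitting zeros."""
    pairs = [f"{i}:{v}" for i, v in enumerate(vector, start=1) if v != 0]
    return " ".join([str(label)] + pairs)


# Behaviors captured from one (hypothetical) sample; unmatched ones are ignored.
captured = ["create_window", "modify_autorun_registry_key", "disable_security_service"]
vector = to_feature_vector(captured)
line = to_libsvm_line(1, vector)  # label 1 = malicious in this sketch
print(vector)
print(line)
```

A fixed library order keeps feature indices stable across samples, which an SVM needs in order to compare samples dimension by dimension.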

Abstract:

To address the low classification accuracy of malware behavior analysis systems, a malware classification method based on Support Vector Machine (SVM) was proposed. First, a risk behavior library that used software behavior results as features was established manually. Then all of the software's behaviors were captured and matched against the risk behavior library, and the matching results were converted by a conversion algorithm into data suitable for SVM training. For the selection of the SVM model, kernel function and parameters (C, g), grid search combined with Genetic Algorithm (GA) was used to search for the optimum after theoretical analysis. A malware behavior assessment system based on the SVM classification model was designed to verify the effectiveness of the proposed method. The experiments show that the false positive rate and false negative rate of the system were 5.52% and 3.04%, respectively. The proposed method thus outperforms K-Nearest Neighbor (KNN) and Naive Bayes (NB); its performance is comparable to that of the Back Propagation (BP) neural network, but it is more efficient in training and classification.

Key words: malware, Support Vector Machine (SVM), Genetic Algorithm (GA), behavior evaluation
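As a rough illustration of the grid-search part of the (C, g) selection, the sketch below runs a coarse pass over powers of two and then a finer pass around the coarse optimum. The exponent ranges, the step sizes, and `toy_score` are assumptions, not values from the paper; in practice `score` would train an SVM with the given C and g and return its cross-validation accuracy. The GA refinement mentioned in the abstract is not implemented here, and the fine grid pass merely stands in for it.

```python
import itertools
import math


def grid_search(score, log2c_values, log2g_values):
    """Return (best_score, log2C, log2g) over the Cartesian product of exponents."""
    best = None
    for lc, lg in itertools.product(log2c_values, log2g_values):
        s = score(2.0 ** lc, 2.0 ** lg)
        if best is None or s > best[0]:
            best = (s, lc, lg)
    return best


def frange(start, stop, step):
    """Inclusive list of evenly spaced values, built by counting steps to avoid drift."""
    n = int(round((stop - start) / step))
    return [start + i * step for i in range(n + 1)]


def coarse_to_fine(score):
    # Coarse pass: assumed exponent ranges with step 2 (not taken from the paper).
    _, lc, lg = grid_search(score, frange(-5, 15, 2), frange(-15, 3, 2))
    # Fine pass: step 0.25 inside a +/-2 neighbourhood of the coarse optimum.
    return grid_search(score, frange(lc - 2, lc + 2, 0.25), frange(lg - 2, lg + 2, 0.25))


def toy_score(C, g):
    """Stand-in for cross-validation accuracy; peaks at C = 2**3.5, g = 2**-6.25."""
    return -((math.log2(C) - 3.5) ** 2 + (math.log2(g) + 6.25) ** 2)


best_score, best_log2c, best_log2g = coarse_to_fine(toy_score)
print(best_log2c, best_log2g)
```

Searching over exponents rather than raw values covers several orders of magnitude with few evaluations, which is why a cheap coarse pass can narrow the region before a more expensive refinement.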

CLC number: