Journal of Computer Applications, 2016, Vol. 36, Issue (4): 941-944. DOI: 10.11772/j.issn.1001-9081.2016.04.0941

• Cyberspace Security •

Malicious domain detection based on multiple-dimensional features

ZHANG Yang, LIU Tingwen, SHA Hongzhou, SHI Jinqiao

  1. Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China
  • Received: 2015-08-31 Revised: 2015-11-02 Online: 2016-04-10 Published: 2016-04-08
  • Corresponding author: ZHANG Yang
  • About the authors: ZHANG Yang (born 1991), male, from Linyi, Shandong; master's candidate; research interests: network and information security. LIU Tingwen (born 1986), male, from Linquan, Anhui; assistant research fellow, Ph.D., CCF member; research interests: big data security analysis, knowledge graphs. SHA Hongzhou (born 1988), male, from Huai'an, Jiangsu; Ph.D.; research interests: information security, data mining. SHI Jinqiao (born 1978), male, from Harbin, Heilongjiang; professor-level senior engineer, Ph.D., CCF member; research interests: information security, data mining.
  • Supported by:
    The National Natural Science Foundation of China (61303260) and the Strategic Priority Research Program of the Chinese Academy of Sciences (XDA06030200).

Abstract: The Domain Name System (DNS) provides domain name resolution, i.e., it converts domain names to IP addresses. Malicious domain detection aims to discover illegal activities that hide behind the DNS and thereby to ensure the normal operation of domain name servers. Prior work on malicious domain detection is summarized, and a machine-learning-based detection method that exploits multiple-dimensional features is proposed. For lexical features of domain names, more fine-grained features are extracted, such as the frequency of switches between digits and letters and the maximum length of consecutive letters. For network attribute features, more attention is paid to the name servers, such as their number and degree of dispersion. Experimental results show that the accuracy, recall, and F1 score of the proposed method all reach 99.8%, indicating good detection performance.

Key words: malicious domain, Domain Name System (DNS), phishing, random forests
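
The two lexical features named in the abstract can be illustrated with a minimal sketch. The abstract does not define them, so the definitions here are assumptions made for illustration only: the switch frequency is taken as the share of adjacent character pairs in which a letter meets a digit, and the function and key names (`lexical_features`, `digit_letter_switch_freq`, `max_consecutive_letters`) are hypothetical rather than the authors' own.

```python
from itertools import groupby


def char_class(ch: str) -> str:
    """Classify a character as 'L' (letter), 'D' (digit), or 'O' (other)."""
    if ch.isalpha():
        return "L"
    if ch.isdigit():
        return "D"
    return "O"


def lexical_features(domain: str) -> dict:
    """Compute two illustrative lexical features of a domain name.

    ASSUMPTION: these definitions are guesses for illustration; the
    abstract names the features but does not specify how they are computed.
    """
    name = domain.lower().rstrip(".")
    classes = [char_class(ch) for ch in name]

    # Count adjacent positions where a letter meets a digit (either order).
    switches = sum(
        1 for a, b in zip(classes, classes[1:]) if {a, b} == {"L", "D"}
    )
    # Normalise by the number of adjacent pairs (assumed normalisation).
    pairs = max(len(name) - 1, 1)

    # Length of the longest run of consecutive letters.
    longest = max(
        (sum(1 for _ in run) for cls, run in groupby(classes) if cls == "L"),
        default=0,
    )
    return {
        "digit_letter_switch_freq": switches / pairs,
        "max_consecutive_letters": longest,
    }
```

Under these assumed definitions, a name such as `a1b2c3.com` yields a high switch frequency and short letter runs, whereas `example.com` yields no switches and a longer run. Features of this kind could then be passed, together with network attributes such as the name server count, to a classifier like the random forests listed in the key words.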
