Malicious domain detection based on multiple-dimensional features

doi:10.11772/j.issn.1001-9081.2016.04.0941

Journal of Computer Applications ›› 2016, Vol. 36 ›› Issue (4): 941-944. DOI: 10.11772/j.issn.1001-9081.2016.04.0941

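ZHANG Yang, LIU Tingwen, SHA Hongzhou, SHI Jinqiao

Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China

Received: 2015-08-31 Revised: 2015-11-02 Online: 2016-04-08 Published: 2016-04-10
Corresponding author: ZHANG Yang
About the authors: ZHANG Yang (born 1991), male, from Linyi, Shandong, M.S. candidate; research interests: network and information security. LIU Tingwen (born 1986), male, from Linquan, Anhui, assistant researcher, Ph.D., CCF member; research interests: big data security analysis, knowledge graphs. SHA Hongzhou (born 1988), male, from Huai'an, Jiangsu, Ph.D.; research interests: information security, data mining. SHI Jinqiao (born 1978), male, from Harbin, Heilongjiang, professor-level senior engineer, Ph.D., CCF member; research interests: information security, data mining.
Supported by:
National Natural Science Foundation of China (61303260); Strategic Priority Research Program of the Chinese Academy of Sciences (XDA06030200).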

Abstract

Abstract: The Domain Name System (DNS) provides domain name resolution, i.e., it converts domain names to IP addresses. Malicious domain detection aims to discover illegal activities that use the DNS as cover and to ensure the normal operation of domain name servers. Prior work on malicious domain detection is summarized, and a machine-learning-based detection method that exploits multiple-dimensional features is proposed. For domain name lexical features, more fine-grained features are extracted, such as the frequency of transitions between digits and letters and the maximum length of consecutive letters. For network attribute features, more attention is paid to the name servers, such as their number and their degree of dispersion. Experimental results show that the accuracy, recall, and F1 value of the proposed method all reach 99.8%, indicating good detection performance.
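The two lexical features named in the abstract can be illustrated with a minimal sketch. The abstract does not give exact definitions, so the ones used here are assumptions: a transition is any adjacent pair of characters that switches between an ASCII digit and an ASCII letter, and dots and hyphens break a run of letters. The function names are hypothetical and are not taken from the paper.

```python
from itertools import groupby


def char_class(ch: str) -> str:
    """Classify a character as 'digit', 'letter', or 'other' (ASCII only)."""
    if ch.isascii() and ch.isdigit():
        return "digit"
    if ch.isascii() and ch.isalpha():
        return "letter"
    return "other"


def digit_letter_transitions(domain: str) -> int:
    """Count adjacent character pairs that switch between a digit and a letter."""
    classes = [char_class(ch) for ch in domain]
    return sum(
        1
        for prev, cur in zip(classes, classes[1:])
        if {prev, cur} == {"digit", "letter"}
    )


def max_letter_run(domain: str) -> int:
    """Return the length of the longest run of consecutive letters."""
    runs = [
        len(list(group))
        for cls, group in groupby(domain, key=char_class)
        if cls == "letter"
    ]
    return max(runs, default=0)


def lexical_features(domain: str) -> dict:
    """Build a small feature dictionary for one domain name.

    Assumption: the transition frequency is the transition count divided by
    the number of adjacent character pairs in the whole name.
    """
    name = domain.lower()
    pairs = max(len(name) - 1, 1)
    transitions = digit_letter_transitions(name)
    return {
        "transitions": transitions,
        "transition_frequency": transitions / pairs,
        "max_letter_run": max_letter_run(name),
    }


if __name__ == "__main__":
    for example in ("a1b2c3.example.com", "google.com"):
        print(example, lexical_features(example))
```

Names that alternate digits and letters, as algorithmically generated names often do, score high on the first feature and low on the second, which is presumably why such features help a classifier separate them from ordinary names.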

Key words: malicious domain, Domain Name System (DNS), phishing, random forests
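For readers unfamiliar with the evaluation measures reported in the abstract, the following sketch computes accuracy, precision, recall, and F1 from binary labels using the standard definitions. The labels below are made-up toy data for illustration only and have nothing to do with the paper's dataset or its 99.8% result; treating "malicious" as the positive class (label 1) is also an assumption.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall, f1) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, precision, recall, f1


if __name__ == "__main__":
    # Toy labels: 1 = malicious, 0 = benign (illustrative only).
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    print(binary_metrics(y_true, y_pred))
```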

CLC number: TP309

ZHANG Yang, LIU Tingwen, SHA Hongzhou, SHI Jinqiao. Malicious domain detection based on multiple-dimensional features[J]. Journal of Computer Applications, 2016, 36(4): 941-944.
