Journal of Computer Applications, 2021, Vol. 41, Issue (10): 2900-2904. DOI: 10.11772/j.issn.1001-9081.2020122002

Special Issue: Cyberspace Security

• Cyber security

Protocol identification approach based on semi-supervised subspace clustering

ZHU Yuna1, ZHANG Yutao2, YAN Shaoge3, FAN Yudan3, CHEN Hantuo4   

  1. Troops 91033 of PLA, Qingdao Shandong 266035, China;
  2. Troops 91286 of PLA, Qingdao Shandong 266003, China;
  3. PLA Information Engineering University, Zhengzhou Henan 450001, China;
  4. Troops 63850 of PLA, Baicheng Jilin 137001, China
  • Received: 2020-12-21 Revised: 2021-04-10 Online: 2021-10-10 Published: 2021-10-27

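  • Corresponding author: ZHU Yuna
  • About the authors: ZHU Yuna (朱玉娜), born in 1985, female, from Heze, Shandong; engineer, Ph.D.; main research interests: cryptographic protocol reverse engineering and identification. ZHANG Yutao (张玉涛), born in 1981, male, from Bozhou, Anhui; engineer, M.S.; main research interest: information security. YAN Shaoge (闫少阁), born in 1983, male, from Zhoukou, Henan; research fellow, M.S.; main research interest: information security. FAN Yudan (范钰丹), born in 1982, female, from Dengzhou, Henan; lecturer, M.S.; main research interests: formal analysis and automated verification of cryptographic protocols. CHEN Hantuo (陈韩托), born in 1990, male, from Fenghua, Zhejiang; assistant engineer, M.S.; main research interest: online security analysis of protocols.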

Abstract: Existing statistical feature-based protocol identification methods do not account for the differences between individual protocols when selecting identification features. To address this problem, a Semi-supervised Subspace-clustering Protocol Identification Approach (SSPIA) was proposed by combining semi-supervised learning with the Fuzzy Subspace Clustering (FSC) method. Firstly, labeled sample flows were transformed into pairwise constraints, which served as the prior constraint conditions. Secondly, a Semi-supervised Fuzzy Subspace Clustering (SFSC) algorithm was proposed on this basis, in which the constraints guide the subspace clustering process. Then, a mapping between clusters and protocol types was established to obtain the weight coefficient of each feature for each protocol, and an individualized cryptographic protocol feature library was constructed for subsequent protocol identification. Finally, clustering and identification experiments were carried out on five typical cryptographic protocols. Experimental results show that, compared with the traditional K-means and FSC methods, SSPIA achieves a better clustering effect, and the protocol identification classifier built with SSPIA is more accurate, with a higher protocol identification rate and a lower false identification rate. SSPIA therefore improves the effect of protocol identification based on statistical features.

Key words: cryptographic protocol, protocol identification, statistical feature, semi-supervised learning, subspace clustering
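
The first and third steps summarized in the abstract (turning labeled flows into pairwise constraints, then mapping clusters to protocol types) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the must-link/cannot-link representation, the majority-vote mapping rule, and the placeholder labels `proto_A` and `proto_B` are all assumptions made for illustration, and the SFSC clustering step itself is not shown.

```python
from collections import Counter
from itertools import combinations


def pairwise_constraints(labels):
    """Convert {flow_id: protocol_label} into pairwise constraints.

    Assumption: flows with the same label form a must-link pair,
    flows with different labels form a cannot-link pair.
    """
    must_link, cannot_link = [], []
    for (a, label_a), (b, label_b) in combinations(sorted(labels.items()), 2):
        if label_a == label_b:
            must_link.append((a, b))
        else:
            cannot_link.append((a, b))
    return must_link, cannot_link


def map_clusters_to_protocols(assignments, labels):
    """Map each cluster to a protocol type by majority vote (an assumed rule).

    assignments: {flow_id: cluster_id} for all flows, labeled or not.
    labels:      {flow_id: protocol_label} for the labeled subset only.
    Clusters that contain no labeled flow are left out of the result.
    """
    votes = {}
    for flow_id, cluster in assignments.items():
        if flow_id in labels:
            votes.setdefault(cluster, Counter())[labels[flow_id]] += 1
    return {cluster: counts.most_common(1)[0][0] for cluster, counts in votes.items()}


# Hypothetical data: three labeled flows out of five, two clusters.
labeled = {0: "proto_A", 1: "proto_A", 2: "proto_B"}
clusters = {0: 0, 1: 0, 2: 1, 3: 1, 4: 0}

must_link, cannot_link = pairwise_constraints(labeled)
cluster_to_protocol = map_clusters_to_protocols(clusters, labeled)
```

In a semi-supervised clustering setup of this kind, the constraint lists would be passed to the clustering algorithm as prior knowledge, and the resulting cluster-to-protocol mapping would let the per-cluster feature weights be read as per-protocol feature weights.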
