Journal of Computer Applications ›› 2019, Vol. 39 ›› Issue (2): 488-493. DOI: 10.11772/j.issn.1001-9081.2018071449

• Cyberspace security •

Business data security of System Wide Information Management based on content mining

MA Lan1, WANG Jingjie2, CHEN Huan2

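  1. School of Air Traffic Management, Civil Aviation University of China, Tianjin 300300, China;
  2. College of Electronic Information and Automation, Civil Aviation University of China, Tianjin 300300, China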
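  • Received: 2018-07-17 Revised: 2018-09-25 Online: 2019-02-10 Published: 2019-02-15
  • Corresponding author: MA Lan
  • About the authors: MA Lan (born 1966), female, from Wuwei, Gansu, professor, Ph.D.; main research interests: air traffic management information and control. WANG Jingjie (born 1993), male, from Yiwu, Zhejiang, master's student; main research interests: aeronautical telecommunication network and information security. CHEN Huan (born 1995), female, from Baoding, Hebei, master's degree; main research interest: air traffic management information security.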
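  • Funding:
    Joint funds of the National Natural Science Foundation of China and the Civil Aviation Administration of China (U1533107); National Science Foundation for Young Scientists of China (61601467); Key Program of the Natural Science Foundation of Tianjin (17JCZDJC30900); 2018 Fundamental Research Funds for the Central Universities of China (3122018D007).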

Business data security of system wide information management based on content mining

MA Lan1, WANG Jingjie2, CHEN Huan2   

  1. School of Air Traffic Management, Civil Aviation University of China, Tianjin 300300, China;
    2. College of Electronic Information and Automation, Civil Aviation University of China, Tianjin 300300, China
  • Received: 2018-07-17 Revised: 2018-09-25 Online: 2019-02-10 Published: 2019-02-15
  • Supported by:
    This work is partially supported by the joint funds of the National Natural Science Foundation of China and the Civil Aviation Administration of China (U1533107), the National Science Foundation for Young Scientists of China (61601467), the Key Program of the Natural Science Foundation of Tianjin (17JCZDJC30900), and the Fundamental Research Funds for the Central Universities of China (3122018D007).

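Abstract: To address data security problems in service sharing within System Wide Information Management (SWIM), the security risks in the SWIM business process were analyzed, and a method for filtering malicious data based on the Latent Dirichlet Allocation (LDA) topic model and content mining was proposed. First, big data analysis was performed on the four kinds of SWIM business data. Next, the LDA model was used to extract features from the business data, completing the content mining step. Finally, the KMP matching algorithm was used to search for pattern strings in the main string, thereby detecting SWIM business data that contains malicious keywords. The detection method was tested in the Linux kernel. The experimental results show that the method can effectively mine the content of SWIM business data and that it achieves better detection performance than methods based on Latent Semantic Analysis (LSA) and probabilistic Latent Semantic Analysis (pLSA).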

Key words: content mining, keyword matching, feature matching, System Wide Information Management (SWIM), business data

Abstract: To address the data security problems of service sharing in SWIM (System Wide Information Management), the risks in the SWIM business process were analyzed, and a malicious data filtering method based on the Latent Dirichlet Allocation (LDA) topic model and content mining was proposed. Firstly, big data analysis was performed on four kinds of SWIM business data. Then, the LDA model was used to extract features from the business data and thereby realize content mining. Finally, the pattern string was searched for in the main string by using the KMP (Knuth-Morris-Pratt) matching algorithm to detect SWIM business data containing malicious keywords. The proposed method was tested in the Linux kernel. The experimental results show that the proposed method can effectively mine the content of SWIM business data and has better detection performance than methods based on Latent Semantic Analysis (LSA) and probabilistic Latent Semantic Analysis (pLSA).

Key words: content mining, keyword matching, feature matching, SWIM (System Wide Information Management), business data
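
The final stage described in the abstract, searching for a pattern string inside a main string with KMP, can be illustrated with a minimal Python sketch. This is not the authors' implementation (which, per the abstract, was tested in the Linux kernel); the function names `build_failure_table`, `kmp_find_all`, and `flag_malicious`, as well as the sample record and keyword list, are hypothetical and chosen only for illustration.

```python
from __future__ import annotations


def build_failure_table(pattern: str) -> list[int]:
    """failure[i] is the length of the longest proper prefix of
    pattern[:i + 1] that is also a suffix of it."""
    failure = [0] * len(pattern)
    k = 0  # length of the currently matched prefix
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = failure[k - 1]  # fall back to a shorter border
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k
    return failure


def kmp_find_all(text: str, pattern: str) -> list[int]:
    """Return the start index of every (possibly overlapping)
    occurrence of pattern in text."""
    if not pattern:
        return []
    failure = build_failure_table(pattern)
    matches = []
    k = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = failure[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = failure[k - 1]  # keep going to find overlapping matches
    return matches


def flag_malicious(record: str, keywords: list[str]) -> dict[str, list[int]]:
    """Hypothetical helper: map each keyword found in the record
    to the positions where it occurs."""
    hits = {}
    for keyword in keywords:
        positions = kmp_find_all(record, keyword)
        if positions:
            hits[keyword] = positions
    return hits


if __name__ == "__main__":
    # Made-up record and keyword list, for illustration only.
    sample = "<msg>DROP TABLE flights</msg>"
    print(flag_malicious(sample, ["DROP TABLE", "rm -rf"]))
```

Because the failure table lets the search resume without re-reading text characters, each keyword is checked in time linear in the combined length of the record and the keyword. In this sketch the keyword list is fixed by hand; how it relates to the LDA feature extraction step is not specified in the abstract, so that connection is left out.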

CLC Number: