Journal of Computer Applications ›› 2015, Vol. 35 ›› Issue (3): 797-801. DOI: 10.11772/j.issn.1001-9081.2015.03.797

• Artificial Intelligence •

Context feature extraction method of terrorism behavior based on dependence maximization

XUE Anrong1, JIA Xiaoyan1, GE Qinglong1, YANG Xiaoqin2

  1. School of Computer Science and Communications Engineering, Jiangsu University, Zhenjiang, Jiangsu 212013, China;
  2. Institute of Communication Engineering, PLA University of Science and Technology, Nanjing, Jiangsu 210007, China
  • Received: 2014-10-08 Revised: 2014-11-24 Online: 2015-03-10 Published: 2015-03-13
  • Corresponding author: JIA Xiaoyan
  • About the authors: XUE Anrong (born 1964), male, from Zhenjiang, Jiangsu; professor, Ph.D., CCF member; research interests: data mining, machine learning. JIA Xiaoyan (born 1990), female, from Nanyang, Henan; master's student; research interest: data mining. GE Qinglong (born 1988), male, from Ganzhou, Jiangxi; master's student; research interest: data mining. YANG Xiaoqin (born 1986), female, from Anqing, Anhui; teaching assistant, M.S.; research interests: network performance analysis, network security.
  • Supported by: National Natural Science Foundation of China (61300228)

Abstract:

To address missing attribute values in terrorism behavior data sets, this paper proposes the Compressed Context Space (CCS) method, which maximizes the dependence between context vectors and actions. CCS builds on the Hilbert-Schmidt independence criterion, which measures the relationship between two variables through a Hilbert-Schmidt norm; this norm has been shown theoretically to detect dependence between variables. CCS maximizes the Hilbert-Schmidt norm between the linearly projected low-dimensional features and the actions. This maximizes the dependence between context vectors and actions, exposes the relevance between them more clearly, and reduces the effect of missing attribute values. A classification model such as a Support Vector Machine (SVM) is then trained on the resulting low-dimensional features (CCS+SVM) to achieve efficient prediction. Experiments on the MAROB data set show that, compared with SVM alone, SVM combined with traditional feature extraction methods (PCA+SVM and CCA+SVM), and the existing terrorism behavior prediction algorithm CONVEX, CCS+SVM improves recall and F-measure by at least 1.5% and 1.0%, respectively, while its precision and Area Under the ROC Curve (AUC) are comparable to the best results. The results indicate that CCS+SVM handles the missing attribute value problem in terrorism data sets well.

Key words: terrorism behavior prediction, feature extraction, Hilbert-Schmidt independence criterion, Support Vector Machine (SVM), Minorities at Risk Organizational Behavior (MAROB)
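The abstract states that CCS maximizes the Hilbert-Schmidt norm between linearly projected features and actions, but it does not give the optimization details. The following is therefore only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the standard biased empirical HSIC estimate tr(KHLH)/(n-1)^2, a linear kernel on the projected features, a kernel matrix L built from the action labels, and an orthonormality constraint on the projection matrix W. Under those assumptions the maximizer is given by the top eigenvectors of X^T H L H X. The function names (`centering_matrix`, `empirical_hsic`, `fit_projection`) and the synthetic data are hypothetical.

```python
import numpy as np


def centering_matrix(n):
    """H = I - (1/n) 11^T, which centers a kernel matrix."""
    return np.eye(n) - np.ones((n, n)) / n


def empirical_hsic(K, L):
    """Biased empirical HSIC estimate: tr(K H L H) / (n - 1)^2."""
    n = K.shape[0]
    H = centering_matrix(n)
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2


def fit_projection(X, L, d):
    """Return W (p x d) with orthonormal columns maximizing tr(W^T X^T H L H X W).

    Assumption: a linear kernel K = (XW)(XW)^T on the projected features,
    so this trace equals (n - 1)^2 times the empirical HSIC between XW and
    the actions. The maximizer is the set of top-d eigenvectors.
    """
    n = X.shape[0]
    H = centering_matrix(n)
    M = X.T @ H @ L @ H @ X
    M = (M + M.T) / 2  # symmetrize against floating-point round-off
    eigvals, eigvecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:d]
    return eigvecs[:, top]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))             # synthetic context vectors
    y = (X[:, 0] + X[:, 1] > 0).astype(int)    # synthetic binary action
    Y = np.eye(2)[y]                           # one-hot labels
    L = Y @ Y.T                                # label kernel (1 if same action)

    W = fit_projection(X, L, d=2)
    Z = X @ W                                  # low-dimensional features
    print("HSIC(projected features, actions):", empirical_hsic(Z @ Z.T, L))
```

The low-dimensional features `Z` can then be passed to any classifier, such as an SVM, which corresponds to the CCS+SVM pipeline described in the abstract.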

CLC number: