Context feature extraction method of terrorism behavior based on dependence maximization

doi:10.11772/j.issn.1001-9081.2015.03.797

Journal of Computer Applications, 2015, Vol. 35, Issue (3): 797-801.

XUE Anrong¹, JIA Xiaoyan¹, GE Qinglong¹, YANG Xiaoqin²

1. School of Computer Science and Communications Engineering, Jiangsu University, Zhenjiang, Jiangsu 212013, China;
2. Institute of Communication Engineering, PLA University of Science and Technology, Nanjing, Jiangsu 210007, China

Received: 2014-10-08 Revised: 2014-11-24 Online: 2015-03-10 Published: 2015-03-13
Corresponding author: JIA Xiaoyan
About the authors: XUE Anrong (1964-), male, native of Zhenjiang, Jiangsu, professor, Ph.D., CCF member, research interests: data mining, machine learning; JIA Xiaoyan (1990-), female, native of Nanyang, Henan, M.S. candidate, research interests: data mining; GE Qinglong (1988-), male, native of Ganzhou, Jiangxi, M.S. candidate, research interests: data mining; YANG Xiaoqin (1986-), female, native of Anqing, Anhui, teaching assistant, M.S., research interests: network performance analysis, network security
Supported by:
National Natural Science Foundation of China (61300228)

Abstract:

To address the missing attribute value problem in terrorism behavior data sets, this paper proposes the Compressed Context Space (CCS) method, which maximizes the dependence between context vectors and actions. CCS is built on the Hilbert-Schmidt independence criterion and the Hilbert-Schmidt norm, which can effectively detect dependence between variables. CCS maximizes the Hilbert-Schmidt norm between the actions and the low-dimensional features obtained by linearly projecting the context vectors. This maximizes the dependence between context vectors and actions, reveals the relationship between them more clearly, and reduces the effect of missing attribute values. A classification model such as a Support Vector Machine (SVM) is then trained on the resulting low-dimensional features (CCS+SVM) to achieve efficient prediction. Experiments on the MAROB data set show that, compared with the SVM model, SVM models based on traditional feature extraction methods such as Principal Component Analysis (PCA) and Canonical Correlation Analysis (CCA), and the existing terrorism behavior prediction algorithm CONVEX, CCS+SVM improves recall and F-measure by at least 1.5% and 1.0% respectively, while its precision and Area Under the ROC Curve (AUC) are comparable to the best results. The results show that CCS+SVM handles the missing attribute value problem in terrorism data sets well.
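
The abstract does not give the optimization details, so the following is only a minimal sketch of the dependence-maximization idea, not the authors' CCS implementation. It assumes linear kernels on both the projected context features and the actions, an orthonormal projection matrix, and the standard biased empirical Hilbert-Schmidt independence criterion estimate tr(KHLH)/(n-1)^2. Under those assumptions the projection reduces to an eigenvalue problem. The helper names (`centering_matrix`, `empirical_hsic`, `fit_projection`) and the toy data are hypothetical and do not come from the paper or from MAROB.

```python
import numpy as np


def centering_matrix(n):
    """Return H = I - (1/n) * 1 1^T, which centers kernel matrices."""
    return np.eye(n) - np.ones((n, n)) / n


def empirical_hsic(K, L):
    """Biased empirical HSIC estimate: tr(K H L H) / (n - 1)^2."""
    n = K.shape[0]
    H = centering_matrix(n)
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2


def fit_projection(X, Y, d):
    """Hypothetical helper: p x d matrix W with orthonormal columns that
    maximizes the empirical HSIC between X @ W and Y (linear kernels assumed).

    With K = (XW)(XW)^T and L = Y Y^T, tr(K H L H) = tr(W^T M W), where
    M = X^T H L H X, so the maximizer is the top-d eigenvectors of M.
    """
    n = X.shape[0]
    H = centering_matrix(n)
    L = Y @ Y.T
    M = X.T @ H @ L @ H @ X
    M = (M + M.T) / 2  # symmetrize to guard against round-off
    _, eigvecs = np.linalg.eigh(M)  # eigenvalues come back in ascending order
    return eigvecs[:, ::-1][:, :d]  # keep the d leading eigenvectors


# Toy data (an assumption for illustration only).
rng = np.random.default_rng(0)
n, p, d = 200, 10, 2
X = rng.normal(size=(n, p))  # context vectors
y = (X[:, 0] + X[:, 1] > 0).astype(float)  # action depends on two attributes
Y = np.column_stack([y, 1.0 - y])  # one-hot action indicators

W = fit_projection(X, Y, d)
Z = X @ W  # low-dimensional features

# Baseline: a random orthonormal projection of the same size.
Q, _ = np.linalg.qr(rng.normal(size=(p, d)))
L = Y @ Y.T
hsic_learned = empirical_hsic(Z @ Z.T, L)
hsic_random = empirical_hsic((X @ Q) @ (X @ Q).T, L)
print("HSIC (learned projection):", hsic_learned)
print("HSIC (random projection): ", hsic_random)
```

In a pipeline like the CCS+SVM combination described above, the low-dimensional features `Z` would then be passed to a classifier such as an SVM. That step is omitted here to keep the sketch dependent on NumPy only.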

Key words: terrorism behavior prediction, feature extraction, Hilbert-Schmidt independence criterion, Support Vector Machine (SVM), Minorities at Risk Organizational Behavior (MAROB)

CLC Number:

TP181

XUE Anrong, JIA Xiaoyan, GE Qinglong, YANG Xiaoqin. Context feature extraction method of terrorism behavior based on dependence maximization[J]. Journal of Computer Applications, 2015, 35(3): 797-801.
