Weakly supervised video anomaly detection based on triplet-centered guidance

doi:10.11772/j.issn.1001-9081.2023050748

Journal of Computer Applications, 2024, Vol. 44, Issue (5): 1452-1457. DOI: 10.11772/j.issn.1001-9081.2023050748

Special topics: Artificial Intelligence; 2023 CCF Conference on Artificial Intelligence (CCFAI 2023)


Zimeng ZHU¹, Zhixin LI², Zhan HUAN², Ying CHEN², Jiuzhen LIANG¹

1. School of Computer Science and Artificial Intelligence, Aliyun School of Big Data, School of Software, Changzhou University, Changzhou, Jiangsu 213164, China
2. School of Microelectronics and Control Engineering, Changzhou University, Changzhou, Jiangsu 213164, China

Received: 2023-05-16 Revised: 2023-06-19 Accepted: 2023-07-04 Online: 2023-08-01 Published: 2024-05-10
Contact: Zhan HUAN
About author: ZHU Zimeng, born in 1999 in Lanzhou, Gansu, M. S. candidate, CCF member. His research interests include data mining, anomaly detection and computer vision.
LI Zhixin, born in 1993 in Luoyang, Henan, Ph. D., CCF member. Her research interests include pattern recognition, computational imaging and phase retrieval.
CHEN Ying, born in 1987 in Changzhou, Jiangsu, Ph. D., lecturer, CCF member. Her research interests include data mining, pattern recognition and IoT smart sensors.
LIANG Jiuzhen, born in 1968 in Tai'an, Shandong, Ph. D., professor, doctoral supervisor, CCF member. His research interests include computer vision, data mining, pattern recognition and deep learning.
HUAN Zhan (first contact), born in 1969 in Xianyang, Shaanxi, Ph. D., professor, doctoral supervisor, CCF member. His research interests include data mining, IoT smart sensors and intelligent perception.
Supported by: National Natural Science Foundation of China (62201093)

Abstract:

To address the complexity, diversity and short duration of anomalies in surveillance video, a weakly supervised video anomaly detection method is introduced that detects anomalies using only video-level labels. An anomaly regression network, VLARNet, based on a Variational AutoEncoder (VAE) and a Long Short-Term Memory (LSTM) network, is proposed as the detection framework; it captures the temporal dependencies in time-series data, removes redundant information and retains the key information in the data. VLARNet treats anomaly detection as a regression problem. To learn detection features, a Triplet-Centered Loss for Anomaly Score Regression (TCLASR) is designed and combined with a Dynamic Multiple Instance Learning (DMIL) loss to further improve the discriminative ability of the features. DMIL widens the inter-class distance between abnormal and normal instances, but it also widens the intra-class distance; TCLASR pulls each instance closer to the center of its own class and pushes it farther from the centers of the other classes. VLARNet was evaluated comprehensively on the ShanghaiTech and CUHK Avenue datasets. The experimental results show that VLARNet makes effective use of the information in video data, achieving an Area Under the receiver operating characteristic Curve (AUC) of 94.64% and 93.00% on the two datasets respectively, higher than all of the compared methods.

Key words: anomaly detection, weakly supervised learning, multiple instance learning, center loss, Area Under receiver operating characteristic Curve (AUC)
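
The abstract describes the triplet-centered idea only in words: an instance should lie close to the center of its own class and far from the centers of other classes. The sketch below illustrates that idea with a generic triplet-center-style hinge loss in plain Python. It is not the paper's TCLASR formulation, which this page does not give; the function names, the two fixed class centers, the instance-level labels and the margin value are all assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_center_loss(features, labels, centers, margin=5.0):
    """Mean of max(0, margin + d(own center) - d(nearest other center)).

    features: list of feature vectors, one per instance
    labels:   list of class ids (hypothetical: 0 = normal, 1 = abnormal)
    centers:  dict mapping class id -> center vector (fixed here; a real
              model would normally learn or update them during training)
    """
    total = 0.0
    for f, y in zip(features, labels):
        d_own = euclidean(f, centers[y])
        d_other = min(euclidean(f, c) for k, c in centers.items() if k != y)
        # The hinge is zero once the instance is at least `margin` closer
        # to its own center than to any other center.
        total += max(0.0, margin + d_own - d_other)
    return total / len(features)


# Toy data: two well-placed instances and one lying midway between centers.
centers = {0: [0.0, 0.0], 1: [10.0, 0.0]}
features = [[1.0, 0.0], [9.0, 0.0], [5.0, 0.0]]
labels = [0, 1, 0]
loss = triplet_center_loss(features, labels, centers, margin=5.0)
```

In this toy example only the midway instance contributes to the loss, which is the behaviour the abstract attributes to a center-guided term: it penalizes instances that are not clearly closer to their own class center.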

CLC Number: TP183

Cite this article: Zimeng ZHU, Zhixin LI, Zhan HUAN, Ying CHEN, Jiuzhen LIANG. Weakly supervised video anomaly detection based on triplet-centered guidance[J]. Journal of Computer Applications, 2024, 44(5): 1452-1457.

Figures and tables (11)

Fig. 1 Module structures of Encoder and Decoder

Fig. 2 Framework of VLARNet model

Fig. 3 Distribution graphs of deep features learned by center loss and TCLASR

Tab. 1 Performance comparison between proposed method and existing methods

Dataset	Method	AUC/%	FAR/%
ShanghaiTech	Ref. [6]	86.30	0.15
ShanghaiTech	Ref. [7]	82.50	0.10
ShanghaiTech	Ref. [9]	84.44	0.18
ShanghaiTech	Ref. [8]	91.24	0.10
ShanghaiTech	Ref. [22]	84.90	-
ShanghaiTech	Proposed	94.64	0.12
CUHK Avenue	Ref. [22]	90.40	-
CUHK Avenue	Ref. [23]	91.51	-
CUHK Avenue	Ref. [24]	92.30	-
CUHK Avenue	Proposed	93.00	0.08
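
The comparison above reports two metrics, AUC and FAR (false alarm rate). As a rough illustration of what they measure, the sketch below computes both from frame-level anomaly scores in plain Python. The function names, the toy scores and the 0.5 decision threshold for FAR are assumptions; this page does not state the exact evaluation protocol used in the paper.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a randomly chosen abnormal frame gets a
    higher score than a randomly chosen normal frame (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def false_alarm_rate(scores, labels, threshold=0.5):
    """Fraction of normal frames whose score exceeds the threshold."""
    neg = [s for s, y in zip(scores, labels) if y == 0]
    return sum(1 for s in neg if s > threshold) / len(neg)


# Toy frame-level scores: label 1 = abnormal frame, 0 = normal frame.
scores = [0.1, 0.2, 0.6, 0.8, 0.9, 0.3]
labels = [0, 0, 0, 1, 1, 1]
auc = roc_auc(scores, labels)          # 8 of 9 abnormal/normal pairs ranked correctly
far = false_alarm_rate(scores, labels)  # 1 of 3 normal frames raises an alarm
```

The pairwise form of AUC is quadratic in the number of frames, which is fine for a small example; a rank-based computation would be the usual choice for full test sets.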

Fig. 4 ROC curves of proposed method

Tab. 2 Comparison of AUCs and FARs with different loss functions

Loss function	AUC/%	FAR/%
$L_{\mathrm{DMIL}}$ (baseline)	87.61	4.59
$L_{\mathrm{DMIL}} + L_{\mathrm{C}}$	90.82	0.63
$L_{\mathrm{DMIL}} + L_{\mathrm{TCL}}$	94.64	0.12

Tab. 3 Comparison of AUC and FAR with different values of hyperparameter λ

$\lambda$	AUC/%	FAR/%	$\lambda$	AUC/%	FAR/%
$10^{0}$	89.13	0.20	$10^{-3}$	93.95	0.15
$10^{-1}$	91.82	0.24	$10^{-4}$	88.46	0.60
$10^{-2}$	94.64	0.12	0	87.61	4.59
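
The role of λ is not defined on this page, but the numbers suggest a reading: the row for λ = 0 (AUC 87.61%, FAR 4.59%) coincides with the DMIL-only baseline of Tab. 2. One form consistent with that observation, stated here as an assumption rather than as the paper's equation, is a weighted sum of the two loss terms:

```latex
% Assumed form of the training objective (not taken from the paper):
% lambda weights the triplet-centered term against the DMIL term.
L = L_{\mathrm{DMIL}} + \lambda \, L_{\mathrm{TCL}},
\qquad
\lambda = 0 \;\Rightarrow\; L = L_{\mathrm{DMIL}} \ \text{(baseline)}
```

Under this reading, the best reported setting $\lambda = 10^{-2}$ gives the triplet-centered term a small weight relative to the DMIL term.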

Tab. 4 Comparison of AUC and FAR with different values of hyperparameter m

$m$	AUC/%	FAR/%	$m$	AUC/%	FAR/%
0	89.85	0.16	5	94.64	0.12
1	91.82	0.24	10	93.95	0.15

Tab. 5 AUC with different values of hyperparameter η

$\eta$	AUC/%	$\eta$	AUC/%
2	94.01	8	81.94
4	94.64	10	64.25
6	90.83

Tab. 6 AUC with different feature extractors

Feature extractor	AUC/%
I3D network with RGB features	91.50
I3D network with optical flow features	86.44
I3D network with hybrid features	94.64

Fig. 5 Visual comparison of model testing results

References (24)

1	HINAMI R, MEI T, SATOH S. Joint detection and recounting of abnormal events by learning deep generic knowledge[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 3619-3627. DOI: 10.1109/iccv.2017.391
2	LIU W, LUO W, LIAN D, et al. Future frame prediction for anomaly detection: a new baseline[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 6536-6545. DOI: 10.1109/cvpr.2018.00684
3	PANG G, YAN C, SHEN C, et al. Self-trained deep ordinal regression for end-to-end video anomaly detection[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 12173-12182. DOI: 10.1109/cvpr42600.2020.01219
4	RAVANBAKHSH M, SANGINETO E, NABI M, et al. Training adversarial discriminators for cross-channel abnormal event detection in crowds[C]// Proceedings of the 2019 IEEE Winter Conference on Applications of Computer Vision. Piscataway: IEEE, 2019: 1896-1904. DOI: 10.1109/wacv.2019.00206
5	DIETTERICH T G, LATHROP R H, LOZANO-PÉREZ T. Solving the multiple instance problem with axis-parallel rectangles[J]. Artificial Intelligence, 1997, 89(1/2): 31-71. DOI: 10.1016/s0004-3702(96)00034-3
6	SULTANI W, CHEN C, SHAH M. Real-world anomaly detection in surveillance videos[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 6479-6488. DOI: 10.1109/cvpr.2018.00678
7	ZHANG J, QING L, MIAO J. Temporal convolutional network with complementary inner bag loss for weakly supervised anomaly detection[C]// Proceedings of the 2019 IEEE International Conference on Image Processing. Piscataway: IEEE, 2019: 4030-4034. DOI: 10.1109/icip.2019.8803657
8	WAN B, FANG Y, XIA X, et al. Weakly supervised video anomaly detection via center-guided discriminative learning[C]// Proceedings of the 2020 IEEE International Conference on Multimedia and Expo. Piscataway: IEEE, 2020: 1-6. DOI: 10.1109/icme46284.2020.9102722
9	ZHONG J-X, LI N, KONG W, et al. Graph convolutional label noise cleaner: train a plug-and-play action classifier for anomaly detection[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 1237-1246. DOI: 10.1109/cvpr.2019.00133
10	CARREIRA J, ZISSERMAN A. Quo Vadis, action recognition? A new model and the Kinetics dataset[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 6299-6308. DOI: 10.1109/cvpr.2017.502
11	KINGMA D P, WELLING M. Auto-encoding variational Bayes[EB/OL]. [2023-05-01]. DOI: 10.1561/2200000056
12	GRAVES A. Long short-term memory[C]// Supervised Sequence Labelling with Recurrent Neural Networks. Berlin: Springer, 2012: 37-45. DOI: 10.1007/978-3-642-24797-2_4
13	YAO R, LIU C, ZHANG L, et al. Unsupervised anomaly detection using variational auto-encoder based feature extraction[C]// Proceedings of the 2019 IEEE International Conference on Prognostics and Health Management. Piscataway: IEEE, 2019: 1-7. DOI: 10.1109/icphm.2019.8819434
14	WEN Y, ZHANG K, LI Z, et al. A discriminative feature learning approach for deep face recognition[C]// Proceedings of the 14th European Conference on Computer Vision. Cham: Springer, 2016: 499-515. DOI: 10.1007/978-3-319-46478-7_31
15	SCHROFF F, KALENICHENKO D, PHILBIN J. FaceNet: a unified embedding for face recognition and clustering[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Washington, DC: IEEE Computer Society, 2015: 815-823. DOI: 10.1109/cvpr.2015.7298682
16	ZHANG Y, ZHOU D, CHEN S, et al. Single-image crowd counting via multi-column convolutional neural network[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 589-597. DOI: 10.1109/cvpr.2016.70
17	LU C, SHI J, JIA J. Abnormal event detection at 150 FPS in MATLAB[C]// Proceedings of the 2013 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2013: 2720-2727. DOI: 10.1109/iccv.2013.338
18	STEINBRÜCKER F, POCK T, CREMERS D. Large displacement optical flow computation without warping[C]// Proceedings of the 2009 IEEE 12th International Conference on Computer Vision. Piscataway: IEEE, 2009: 1609-1614. DOI: 10.1109/iccv.2009.5459364
19	GLOROT X, BENGIO Y. Understanding the difficulty of training deep feedforward neural networks[C]// Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics. New York: JMLR, 2010: 249-256.
20	SRIVASTAVA N, HINTON G, KRIZHEVSKY A, et al. Dropout: a simple way to prevent neural networks from overfitting[J]. The Journal of Machine Learning Research, 2014, 15(1): 1929-1958.
21	KINGMA D P, BA J. Adam: a method for stochastic optimization[EB/OL]. [2023-05-01].
22	IONESCU R T, KHAN F S, GEORGESCU M-I, et al. Object-centric auto-encoders and dummy anomalies for abnormal event detection in video[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 7842-7851. DOI: 10.1109/cvpr.2019.00803
23	GEORGESCU M-I, BARBALAU A, IONESCU R T, et al. Anomaly detection in video via self-supervised and multi-task learning[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 12742-12752. DOI: 10.1109/cvpr46437.2021.01255
24	GEORGESCU M-I, IONESCU R T, KHAN F S, et al. A background-agnostic framework with adversarial training for abnormal event detection in video[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(9): 4505-4523.
