Anomaly detection in video via independently recurrent neural network and variational autoencoder network

doi:10.11772/j.issn.1001-9081.2021122081

Journal of Computer Applications, 2023, Vol. 43, Issue (2): 507-513.

• Multimedia computing and computer simulation •

Qing JIA, Laihua WANG, Weisheng WANG

School of Cyber Science and Engineering, Qufu Normal University, Qufu Shandong 273165, China

Received: 2021-12-09 Revised: 2022-04-13 Accepted: 2022-05-13 Online: 2022-06-13 Published: 2023-02-10
Contact: Qing JIA
About author: WANG Laihua, born in 1988, female, native of Liaocheng, Shandong, Ph. D., associate professor. Her research interests include digital image processing and video anomaly detection.
WANG Weisheng, born in 1997, male, native of Yantai, Shandong, M. S. candidate. His research interests include video anomaly detection and computer vision.
Supported by:
National Natural Science Foundation of China (61601261)

Abstract:

To effectively extract the temporal information between consecutive video frames, a prediction network, IndRNN-VAE, that fuses an Independently Recurrent Neural Network (IndRNN) with a Variational AutoEncoder (VAE) network was proposed. Firstly, the spatial information of video frames was extracted by the VAE network, and the latent features of the frames were obtained through a linear transformation. Secondly, the latent features were fed into the IndRNN to obtain the temporal information of the frame sequence. Finally, the latent variables and the temporal information were fused through a residual block and passed to the decoding network to generate the predicted frame. Experiments on the public UCSD Ped1, UCSD Ped2 and Avenue datasets show that, compared with existing anomaly detection methods, the IndRNN-VAE-based method improves performance significantly: the Area Under Curve (AUC) values reach 84.3%, 96.2% and 86.6% respectively, the Equal Error Rate (EER) values reach 22.7%, 8.8% and 19.0% respectively, and the differences in mean anomaly scores reach 0.263, 0.497 and 0.293 respectively. In addition, the method runs at 28 Frames Per Second (FPS).
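The pipeline in the abstract (per-frame latent features, an IndRNN over the sequence, residual fusion before decoding) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the encoder and decoder are omitted, the recurrence is the standard IndRNN update h_t = ReLU(W x_t + u ⊙ h_{t-1} + b) from reference [17], and the helper names `indrnn_step`, `reparameterize`, `fuse_residual` and `predict_latent` are hypothetical. The fusion is assumed to be a plain element-wise addition, which requires the hidden size to equal the latent size.

```python
import math
import random


def indrnn_step(x, h_prev, W, u, b):
    """One IndRNN update: h_t = relu(W x_t + u * h_{t-1} + b).

    u is a vector, not a matrix, so each hidden unit only sees its own
    previous state (the "independent" recurrence).
    """
    h = []
    for i in range(len(u)):
        wx = sum(W[i][j] * x[j] for j in range(len(x)))
        h.append(max(0.0, wx + u[i] * h_prev[i] + b[i]))
    return h


def reparameterize(mu, log_var, rng):
    """Standard VAE sampling: z = mu + sigma * eps with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def fuse_residual(z, h):
    """Assumed fusion: element-wise sum of latent variable and temporal state."""
    return [a + c for a, c in zip(z, h)]


def predict_latent(latents, W, u, b):
    """Run the IndRNN over per-frame latents, then fuse with the last latent.

    The result is what a decoder (not shown) would turn into the predicted frame.
    """
    h = [0.0] * len(u)
    for z in latents:
        h = indrnn_step(z, h, W, u, b)
    return fuse_residual(latents[-1], h)
```

Because the recurrent weight is applied element-wise, the cost of the recurrence grows linearly with the hidden size, which is one reason such a cell may suit frame-rate-sensitive prediction.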

Key words: video anomaly detection, video surveillance, Variational AutoEncoder (VAE), Independently Recurrent Neural Network (IndRNN), feature extraction

CLC Number:

TP391.41

Qing JIA, Laihua WANG, Weisheng WANG. Anomaly detection in video via independently recurrent neural network and variational autoencoder network[J]. Journal of Computer Applications, 2023, 43(2): 507-513.

Figures/Tables (14)

Fig. 1 Overall flowchart of video anomaly detection

Fig. 2 IndRNN-VAE network model

Fig. 3 Recurrent IndRNN

Fig. 4 Schematic diagram of model training structure

Fig. 5 Examples of abnormal events in different datasets

Tab. 1 Comparison of AUC and EER values of related anomaly detection methods (%)

Method	Type	Ped1 AUC	Ped1 EER	Ped2 AUC	Ped2 EER	Avenue AUC	Avenue EER
Conv-AE [9]	Frame reconstruction	75.0	27.9	85.0	21.7	80.0	23.0
Unmask [10]	Frame reconstruction	68.4	-	82.2	-	80.6	-
FP [11]	Frame prediction	83.1	-	95.4	-	84.9	-
AD [12]	Frame prediction	83.9	-	96.0	-	86.0	-
GMFC-VAE [13]	Frame reconstruction	94.9	11.3	92.2	12.6	83.4	22.7
R-STAE [14]	Frame reconstruction	-	-	83.0	-	82.0	-
R-VAE [16]	Frame reconstruction	75.0	32.4	91.0	15.5	79.6	27.5
Proposed method	Frame prediction	84.3	22.7	96.2	8.8	86.6	19.0
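For readers unfamiliar with the two metrics in Tab. 1, the following sketch shows one common way to compute frame-level AUC and EER from per-frame anomaly scores and ground-truth labels. It is an illustration only, not the evaluation code used in the paper; the function names are hypothetical, labels are assumed to be 1 for abnormal frames and 0 for normal ones, and both classes are assumed to be present.

```python
def roc_points(scores, labels):
    """ROC points (fpr, tpr), sweeping the threshold from high to low.

    Assumes labels contain at least one 1 (abnormal) and one 0 (normal).
    """
    pos = sum(labels)
    neg = len(labels) - pos
    order = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(order):
        s = order[i][0]
        # Consume all frames tied at this score so ties yield one ROC point.
        while i < len(order) and order[i][0] == s:
            if order[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def eer(points):
    """Equal error rate: the FPR at which FPR equals FNR = 1 - TPR.

    Uses linear interpolation between neighbouring ROC points.
    """
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        d0 = x0 - (1.0 - y0)  # FPR - FNR at the segment start
        d1 = x1 - (1.0 - y1)  # FPR - FNR at the segment end
        if d0 <= 0.0 <= d1:
            if d1 == d0:
                return x0
            t = -d0 / (d1 - d0)
            return x0 + t * (x1 - x0)
    return 1.0  # not reached for a valid ROC curve ending at (1, 1)
```

A higher AUC and a lower EER both indicate better separation of abnormal from normal frames, which is how the columns of Tab. 1 should be read.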

Tab. 2 Time performance comparison of related anomaly detection methods

Method	FPS
Unmask [10]	20
FP [11]	25
R-STAE [14]	14
Proposed method	28

Tab. 3 Comparison of difference value ΔS on different datasets

Method	Ped1	Ped2	Avenue
Conv-AE [9]	0.243	0.384	0.256
FP [11]	0.259	0.469	0.275
Proposed method	0.263	0.497	0.293
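The abstract describes ΔS as the difference in mean anomaly scores. A minimal sketch of that quantity follows, under the assumption that it is the gap between the mean score of abnormal frames and the mean score of normal frames. The absolute value is taken here because the page does not state the sign convention, and `delta_s` is a hypothetical name.

```python
def delta_s(scores, labels):
    """Gap between mean scores of abnormal (label 1) and normal (label 0) frames."""
    normal = [s for s, y in zip(scores, labels) if y == 0]
    abnormal = [s for s, y in zip(scores, labels) if y == 1]
    if not normal or not abnormal:
        raise ValueError("both normal and abnormal frames are required")
    mean_normal = sum(normal) / len(normal)
    mean_abnormal = sum(abnormal) / len(abnormal)
    # Assumption: only the size of the gap matters, not its direction.
    return abs(mean_abnormal - mean_normal)
```

Under this reading, a larger ΔS means the scores of normal and abnormal frames are further apart, so the values in Tab. 3 are better when higher.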

Fig. 6 Anomaly scores of video frames on Ped1 dataset

Fig. 7 Anomaly scores of video frames on Ped2 dataset

Fig. 8 Anomaly scores of video frames on Avenue dataset

Fig. 9 AUC values of different layers on Ped2 dataset

Tab. 4 Performance of different module combinations in network (%)

Method	AUC	EER
Base	94.0	12.4
Base+IndRNN	95.6	10.9
Base+IndRNN+GAN	96.2	8.8

Tab. 5 Performance of different loss function combinations in network (%)

Loss function	AUC
Gradient loss + multi-scale structural similarity loss	93.9
Gradient loss + mixed loss	95.9
Gradient loss + mixed loss + total variation loss	96.2
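Two of the loss terms named in Tab. 5 have widely used textbook forms, sketched below for a single-channel image stored as a list of rows. This page does not give the paper's exact definitions, so these are assumptions: the gradient loss follows the form common in frame-prediction work (comparing absolute neighbouring-pixel differences of the prediction and the target), and the total variation term is the simple anisotropic variant. The mixed loss and the multi-scale structural similarity loss are not sketched because their formulation is not stated here.

```python
def gradient_loss(pred, target):
    """Sum of | |grad pred| - |grad target| | over horizontal and vertical neighbours."""
    h, w = len(pred), len(pred[0])
    total = 0.0
    for i in range(h):
        for j in range(w):
            if j + 1 < w:  # horizontal neighbour
                gp = abs(pred[i][j + 1] - pred[i][j])
                gt = abs(target[i][j + 1] - target[i][j])
                total += abs(gp - gt)
            if i + 1 < h:  # vertical neighbour
                gp = abs(pred[i + 1][j] - pred[i][j])
                gt = abs(target[i + 1][j] - target[i][j])
                total += abs(gp - gt)
    return total


def total_variation(img):
    """Anisotropic total variation: sum of absolute neighbouring-pixel differences."""
    h, w = len(img), len(img[0])
    total = 0.0
    for i in range(h):
        for j in range(w):
            if j + 1 < w:
                total += abs(img[i][j + 1] - img[i][j])
            if i + 1 < h:
                total += abs(img[i + 1][j] - img[i][j])
    return total
```

The gradient term penalises predicted frames whose edges differ from the target, while the total variation term discourages noisy predictions; a weighted sum of such terms would be one plausible way to combine them.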

References (21)

1	HU Z P, ZHANG L, LI S F, et al. Review of abnormal behavior detection and location for intelligent video surveillance systems[J]. Journal of Yanshan University, 2019, 43(1): 1-12. DOI: 10.3969/j.issn.1007-791X.2019.01.001
2	ZHENG B B, FAN X N, LI M, et al. Trajectory segment-based abnormal behavior detection method using LDA model[J]. Journal of Computer Applications, 2015, 35(2): 515-518, 565. DOI: 10.11772/j.issn.1001-9081.2015.02.0515
3	DALAL N, TRIGGS B. Histograms of oriented gradients for human detection[C]// Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 1. Piscataway: IEEE, 2005: 886-893. DOI: 10.1109/cvpr.2005.4
4	DALAL N, TRIGGS B, SCHMID C. Human detection using oriented histograms of flow and appearance[C]// Proceedings of the 2006 European Conference on Computer Vision, LNCS 3952. Berlin: Springer, 2006: 428-441.
5	CHAN A B, VASCONCELOS N. Modeling, clustering, and segmenting video with mixtures of dynamic textures[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2008, 30(5): 909-926. DOI: 10.1109/tpami.2007.70738
6	MEHRAN R, OYAMA A, SHAH M. Abnormal crowd behavior detection using social force model[C]// Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2009: 935-942. DOI: 10.1109/cvpr.2009.5206641
7	LI M, LIU K, LUO H Q, et al. Anomaly detection algorithm improvement based on Gaussian mixture model[J]. Computer Applications and Software, 2014, 31(6): 198-200. DOI: 10.3969/j.issn.1000-386x.2014.06.054
8	XU T, TIAN C Y, LIU C H. Deep learning for abnormal crowd behavior detection: a review[J]. Computer Science, 2021, 48(9): 125-134. DOI: 10.11896/jsjkx.201100015
9	HASAN M, CHOI J, NEUMANN J, et al. Learning temporal regularity in video sequences[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 733-742. DOI: 10.1109/cvpr.2016.86
10	IONESCU R, SMEUREANU S, ALEXE B, et al. Unmasking the abnormal events in video[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 2914-2922. DOI: 10.1109/iccv.2017.315
11	LIU W, LUO W X, LIAN D Z, et al. Future frame prediction for anomaly detection - a new baseline[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 6536-6545. DOI: 10.1109/cvpr.2018.00684
12	ZHOU J T, ZHANG L, FANG Z W, et al. Attention-driven loss for anomaly detection in video surveillance[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2020, 30(12): 4639-4647. DOI: 10.1109/tcsvt.2019.2962229
13	FAN Y X, WEN G J, LI D R, et al. Video anomaly detection and localization via Gaussian mixture fully convolutional variational autoencoder[J]. Computer Vision and Image Understanding, 2020, 195: No.102920. DOI: 10.1016/j.cviu.2020.102920
14	DEEPAK K, CHANDRAKALA S, MOHAN C K. Residual spatiotemporal autoencoder for unsupervised video anomaly detection[J]. Signal, Image and Video Processing, 2021, 15(1): 215-222. DOI: 10.1007/s11760-020-01740-1
15	NAWARATNE R, ALAHAKOON D, DE SILVA D, et al. Spatiotemporal anomaly detection using deep learning for real-time video surveillance[J]. IEEE Transactions on Industrial Informatics, 2020, 16(1): 393-402. DOI: 10.1109/tii.2019.2938527
16	YAN S Y, SMITH J S, LU W J, et al. Abnormal event detection from videos using a two-stream recurrent variational autoencoder[J]. IEEE Transactions on Cognitive and Developmental Systems, 2020, 12(1): 30-42. DOI: 10.1109/tcds.2018.2883368
17	LI S, LI W Q, COOK C, et al. Independently Recurrent Neural Network (IndRNN): building a longer and deeper RNN[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 5457-5466. DOI: 10.1109/cvpr.2018.00572
18	KINGMA D P, WELLING M. Auto-encoding variational Bayes[EB/OL]. (2014-05-01)[2021-11-01]. DOI: 10.1561/2200000056
19	MAKHZANI A, SHLENS J, JAITLY N, et al. Adversarial autoencoders[EB/OL]. (2016-05-25)[2021-11-01].
20	GOODFELLOW I J, POUGET-ABADIE J, MIRZA M, et al. Generative adversarial nets[C]// Proceedings of the 27th International Conference on Neural Information Processing Systems - Volume 2. Cambridge: MIT Press, 2014: 2672-2680.
21	MAHENDRAN A, VEDALDI A. Understanding deep image representations by inverting them[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 5188-5196. DOI: 10.1109/cvpr.2015.7299155
