Video anomaly detection for moving foreground regions

doi:10.11772/j.issn.1001-9081.2024040519

Journal of Computer Applications, 2025, Vol. 45, Issue (4): 1300-1309. DOI: 10.11772/j.issn.1001-9081.2024040519

• Multimedia computing and computer simulation •

Lihu PAN, Shouxin PENG, Rui ZHANG, Zhiyang XUE, Xuzhen MAO

College of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan Shanxi 030024, China

Received: 2024-04-25 Revised: 2024-09-12 Accepted: 2024-09-14 Online: 2025-04-08 Published: 2025-04-10
Contact: Lihu PAN
About author: PENG Shouxin, born in 1998 in Jiujiang, Jiangxi, M. S. candidate. His research interests include deep learning and video anomaly detection.
ZHANG Rui, born in 1987 in Taiyuan, Shanxi, Ph. D., associate professor. His research interests include intelligent information processing and automated machine learning.
XUE Zhiyang, born in 1999 in Taiyuan, Shanxi, M. S. candidate. His research interests include artificial intelligence and object detection.
MAO Xuzhen, born in 1995 in Lüliang, Shanxi, M. S. candidate. Her research interests include video anomaly detection.
Supported by:
Shanxi Provincial Basic Research Program (202203021221145); Shanxi Province Graduate Joint Cultivation Demonstration Base Project (2022JD11)

Abstract

An imbalance in data distribution between static background information and moving foreground objects often leads to insufficient learning of the foreground regions where anomalies occur, which in turn reduces the accuracy of Video Anomaly Detection (VAD). To address this problem, a Nested U-shaped Frame Predictive Generative Adversarial Network (NUFP-GAN) method is proposed for VAD. The method uses a nested U-shaped frame prediction network architecture, which can highlight salient targets in video frames, as the frame prediction module. In the discrimination stage, a self-attention patch discriminator is designed that applies receptive fields of different sizes to extract the more important appearance and motion features from video frames, improving the accuracy of anomaly detection. In addition, to keep the multi-scale features of predicted and real frames consistent at the level of high-level semantic information, a multi-scale consistency loss is introduced to further improve anomaly detection performance. Experimental results show that the proposed method achieves Area Under Curve (AUC) values of 87.6%, 85.2%, 96.0%, and 73.3% on the CUHK Avenue, UCSD Ped1, UCSD Ped2, and ShanghaiTech datasets, respectively; on the ShanghaiTech dataset, its AUC is 1.8 percentage points higher than that of the MAMC (Memory-enhanced Appearance-Motion Consistency) method. These results indicate that the proposed method can effectively address the challenges caused by data distribution imbalance in VAD.

Key words: deep learning, Video Anomaly Detection (VAD), Generative Adversarial Network (GAN), future frame prediction, unsupervised learning
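The abstract describes a frame-prediction approach, and the figure captions on this page refer to PSNR and normality scores. The paper's exact scoring formula is not given here, so the following is only a minimal sketch of a scoring scheme commonly used in prediction-based VAD: compute the PSNR between each predicted frame and the real frame, then min-max normalise the PSNR values within a video. The function names, the flat-list frame representation, and the toy pixel values are all illustrative assumptions.

```python
import math


def psnr(pred, real, max_val=1.0):
    """PSNR between two frames given as flat lists of pixel values in [0, max_val]."""
    if len(pred) != len(real) or not pred:
        raise ValueError("frames must be non-empty and have the same size")
    mse = sum((p - r) ** 2 for p, r in zip(pred, real)) / len(pred)
    mse = max(mse, 1e-10)  # avoid division by zero for a perfect prediction
    return 10.0 * math.log10(max_val ** 2 / mse)


def normality_scores(psnr_values):
    """Min-max normalise the PSNR values of one video to [0, 1].

    A higher score means the frame was predicted well (more likely normal).
    """
    lo, hi = min(psnr_values), max(psnr_values)
    if hi == lo:
        return [1.0 for _ in psnr_values]
    return [(v - lo) / (hi - lo) for v in psnr_values]


# Toy data (hypothetical): one real frame and three predictions of it.
real = [0.5, 0.5, 0.5, 0.5]
preds = [
    [0.5, 0.5, 0.5, 0.6],   # small error
    [0.5, 0.5, 0.5, 0.55],  # smallest error
    [0.1, 0.9, 0.2, 0.8],   # large error, as expected around an anomaly
]
frame_psnr = [psnr(p, real) for p in preds]
scores = normality_scores(frame_psnr)
```

Under this scheme the badly predicted frame receives the lowest normality score, so thresholding the score flags it as a candidate anomaly.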

CLC Number: TP391.4

Lihu PAN, Shouxin PENG, Rui ZHANG, Zhiyang XUE, Xuzhen MAO. Video anomaly detection for moving foreground regions[J]. Journal of Computer Applications, 2025, 45(4): 1300-1309.

Figures/Tables (11)

Fig. 1 Model architecture of NUFP-GAN

Fig. 2 Network structure of RSU

Fig. 3 Network structure of self-attention module

Tab. 1 AUC values of different types of methods on different datasets (%)

Method type	Method	CUHK Avenue	UCSD Ped1	UCSD Ped2	ShanghaiTech
Traditional	Method of Ref. [40]	-	59.0	69.3	-
Traditional	Method of Ref. [2]	-	81.8	82.9	-
Traditional	Method of Ref. [41]	78.3	-	-	-
Reconstruction	Conv-AE [5]	80.0	75.0	85.0	60.9
Reconstruction	ConvLSTM-AE [42]	77.0	75.5	88.1	-
Reconstruction	sRNN [3]	81.7	-	92.2	68.0
Reconstruction	MemAE [21]	83.3	-	94.1	71.2
Reconstruction	MNAD-Recon [20]	82.8	-	90.2	69.8
Reconstruction	sRNN-AE [43]	83.5	-	92.2	69.6
Prediction	Method of Ref. [8]	84.9	83.1	95.4	72.8
Prediction	Method of Ref. [44]	85.3	83.7	95.9	73.7
Prediction	Method of Ref. [45]	86.0	-	95.4	74.1
Prediction	MAMC [46]	87.6	-	96.7	71.5
Reconstruction + prediction	Method of Ref. [48]	83.7	-	96.2	71.5
Reconstruction + prediction	Method of Ref. [47]	80.6	-	89.4	68.9
Proposed	NUFP-GAN	87.6	85.2	96.0	73.3
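For readers unfamiliar with the metric in Tab. 1, frame-level AUC can be read as the probability that a randomly chosen abnormal frame receives a higher anomaly score than a randomly chosen normal frame. The sketch below computes it by direct pairwise comparison. It assumes labels of 1 for abnormal and 0 for normal frames and uses made-up toy scores; the paper's own evaluation protocol is not shown on this page.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for abnormal frames, 0 for normal frames.
    scores: anomaly scores, where higher means more abnormal.
    Ties between an abnormal and a normal frame count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one abnormal and one normal frame")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): six frames, two of them abnormal.
labels = [0, 0, 0, 1, 1, 0]
anomaly_scores = [0.1, 0.2, 0.4, 0.8, 0.3, 0.05]
auc = roc_auc(labels, anomaly_scores)
print(f"AUC = {auc:.3f}")  # 7 of 8 abnormal/normal pairs are ordered correctly
```

The pairwise form is quadratic in the number of frames; it is chosen here for readability, and a rank-based computation gives the same value on large test sets.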

Tab. 2 EER performance comparison and analysis of proposed method and state-of-the-art methods on three public datasets (%)

Method	UCSD Ped1	UCSD Ped2	CUHK Avenue
Method of Ref. [40]	40.0	30.0	-
Method of Ref. [2]	25.0	25.0	-
Conv-AE [5]	27.9	21.7	25.1
Method of Ref. [8]	-	11.9	22.3
Method of Ref. [49]	25.2	12.5	20.7
ST-CaAE [50]	15.3	16.7	24.4
Method of Ref. [45]	-	20.0	23.0
NUFP-GAN	21.4	10.2	20.3
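The equal error rate (EER) reported in Tab. 2 is the error at the operating point where the false positive rate equals the false negative rate, so lower is better. The sketch below approximates it by sweeping every observed score as a threshold and keeping the point where the two rates are closest. The labelling convention (1 for abnormal), the threshold rule, and the toy scores are assumptions made for illustration, not the paper's evaluation code.

```python
def equal_error_rate(labels, scores):
    """Approximate EER by sweeping thresholds over the observed scores.

    labels: 1 for abnormal frames, 0 for normal frames.
    scores: anomaly scores; a frame is flagged abnormal when score >= threshold.
    Returns the mean of FPR and FNR at the threshold where they are closest.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one abnormal and one normal frame")
    best_gap, best_eer = None, None
    for t in sorted(set(scores)):
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= t)
        fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < t)
        fpr, fnr = fp / n_neg, fn / n_pos
        gap = abs(fpr - fnr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (fpr + fnr) / 2.0
    return best_eer


# Toy data (hypothetical): four normal and four abnormal frames.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
anomaly_scores = [0.1, 0.2, 0.3, 0.7, 0.4, 0.6, 0.8, 0.9]
eer = equal_error_rate(labels, anomaly_scores)
print(f"EER = {eer:.2%}")
```

In the toy data, the threshold 0.6 misclassifies one normal and one abnormal frame out of four each, which is where the two error rates meet.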

Fig. 4 Average score difference between normal and abnormal frames on three public datasets

Fig. 5 Normality score curves on different datasets

Fig. 6 Corresponding relationship between PSNR and abnormal event

Fig. 7 Saliency feature maps and prediction error heatmaps on UCSD Ped2 dataset

Fig. 8 AUC performance with different values of λmsc on UCSD Ped1 dataset

Tab. 3 Ablation experiments on UCSD Ped1 dataset

Combination No.	Baseline model	Nested U-shaped frame prediction network	Self-attention patch discriminator	Multi-scale consistency loss	AUC/%
1	√	×	×	×	83.1
2	×	√	×	×	84.3
3	×	×	√	×	84.1
4	×	×	×	√	83.2
5	×	√	√	×	85.0
6	×	×	√	√	84.6
7	×	√	×	√	84.9
8	×	√	√	√	85.2

References (50)

1	BENEZETH Y, JODOIN P M, SALIGRAMA V, et al. Abnormal events detection based on spatio-temporal co-occurences [C]// Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2009: 2458-2465.
2	MAHADEVAN V, LI W, BHALODIA V, et al. Anomaly detection in crowded scenes [C]// Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2010: 1975-1981.
3	LUO W, LIU W, GAO S. A revisit of sparse coding based anomaly detection in stacked RNN framework [C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 341-349.
4	MEHRAN R, OYAMA A, SHAH M. Abnormal crowd behavior detection using social force model [C]// Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2009: 935-942.
5	HASAN M, CHOI J, NEUMANN J, et al. Learning temporal regularity in video sequences [C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 733-742.
6	SABOKROU M, KHALOOEI M, FATHY M, et al. Adversarially learned one-class classifier for novelty detection [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 3379-3388.
7	GOODFELLOW I, POUGET-ABADIE J, MIRZA M, et al. Generative adversarial networks [J]. Communications of the ACM, 2020, 63(11): 139-144.
8	LIU W, LUO W, LIAN D, et al. Future frame prediction for anomaly detection - a new baseline [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 6536-6545.
9	LI C, LI H, ZHANG G. Future frame prediction based on generative assistant discriminative network for anomaly detection [J]. Applied Intelligence, 2023, 53: 542-559.
10	SINGH R, SETHI A, SAINI K, et al. CVAD-GAN: constrained video anomaly detection via generative adversarial network [J]. Image and Vision Computing, 2024, 143: No.104950.
11	LI H, CHEN J, HUANG X, et al. FOAD: a novel video anomaly detection focusing on objects [J]. Multimedia Tools and Applications, 2024, 83(7): 20637-20651.
12	XU H, LIU W, XING W, et al. Motion-aware future frame prediction for video anomaly detection based on saliency perception [J]. Signal, Image and Video Processing, 2022, 16: 2121-2129.
13	QIN X, ZHANG Z, HUANG C, et al. U²-Net: going deeper with nested U-structure for salient object detection [J]. Pattern Recognition, 2020, 106: No.107404.
14	LU C, SHI J, JIA J. Abnormal event detection at 150 FPS in MATLAB [C]// Proceedings of the 2013 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2013: 2720-2727.
15	GIRSHICK R. Fast R-CNN [C]// Proceedings of the 2015 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2015: 1440-1448.
16	HU X, LIAN J, ZHANG D, et al. Video anomaly detection based on 3D convolutional auto-encoder [J]. Signal, Image and Video Processing, 2022, 16: 1885-1893.
17	WANG L, ZHOU F, LI Z, et al. Abnormal event detection in videos using hybrid spatio-temporal autoencoder [C]// Proceedings of the 25th IEEE International Conference on Image Processing. Piscataway: IEEE, 2018: 2276-2280.
18	SHAO W, RAJAPAKSHA P, WEI Y, et al. COVAD: content-oriented video anomaly detection using a self-attention based deep learning model [J]. Virtual Reality and Intelligent Hardware, 2023, 5(1): 24-41.
19	ZHENG Z, YANG X W, XIE J B, et al. Autoencoder video anomaly detection based on hybrid attention [J]. Computer Engineering and Design, 2024, 45(2): 516-523.
20	PARK H, NOH J, HAM B. Learning memory-guided normality for anomaly detection [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 14360-14369.
21	GONG D, LIU L, LE V, et al. Memorizing normality to detect anomaly: memory-augmented deep autoencoder for unsupervised anomaly detection [C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 1705-1714.
22	LE V T, KIM Y G. Attention-based residual autoencoder for video anomaly detection [J]. Applied Intelligence, 2023, 53(3): 3240-3254.
23	NGUYEN T N, MEUNIER J. Anomaly detection in video sequence with appearance-motion correspondence [C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 1273-1283.
24	DOSHI K, YILMAZ Y. Online anomaly detection in surveillance videos with asymptotic bound on false alarm rate [J]. Pattern Recognition, 2021, 114: No.107865.
25	ZAHEER M Z, MAHMOOD A, KHAN M H, et al. Generative cooperative learning for unsupervised video anomaly detection [C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 14724-14734.
26	AKCAY S, ATAPOUR-ABARGHOUEI A, BRECKON T P. GANomaly: semi-supervised anomaly detection via adversarial training [C]// Proceedings of the 2018 Asian Conference on Computer Vision, LNCS 11363. Cham: Springer, 2019: 622-637.
27	PERERA P, NALLAPATI R, XIANG B. OCGAN: one-class novelty detection using GANs with constrained latent representations [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 2893-2901.
28	ACSINTOAE A, FLORESCU A, GEORGESCU M I, et al. Ubnormal: new benchmark for supervised open-set video anomaly detection [C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 20111-20121.
29	LEE S, KIM H G, RO Y M. STAN: spatio-temporal adversarial networks for abnormal event detection [C]// Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2018: 1323-1327.
30	LIU Y, LIU J, YANG K, et al. AMP-net: appearance-motion prototype network assisted automatic video anomaly detection system [J]. IEEE Transactions on Industrial Informatics, 2024, 20(2): 2843-2855.
31	SUN Z, WANG P, ZHENG W, et al. Dual GroupGAN: an unsupervised four-competitor (2V2) approach for video anomaly detection [J]. Pattern Recognition, 2024, 153: No.110500.
32	SINGH R, SETHI A, SAINI K, et al. Attention-guided generator with dual discriminator GAN for real-time video anomaly detection [J]. Engineering Applications of Artificial Intelligence, 2024, 131: No.107830.
33	LIM B, SON S, KIM H, et al. Enhanced deep residual networks for single image super-resolution [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 1132-1140.
34	VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2017: 6000-6010.
35	ISOLA P, ZHU J Y, ZHOU T, et al. Image-to-image translation with conditional adversarial networks [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 5967-5976.
36	JOHNSON J, ALAHI A, LI F F. Perceptual losses for real-time style transfer and super-resolution [C]// Proceedings of the 2016 European Conference Computer Vision, LNCS 9906. Cham: Springer, 2016: 694-711.
37	WANG X, CHE Z, JIANG B, et al. Robust unsupervised video anomaly detection by multipath frame prediction [J]. IEEE Transactions on Neural Networks and Learning Systems, 2022, 33(6): 2301-2312.
38	MATHIEU M, COUPRIE C, LeCUN Y. Deep multi-scale video prediction beyond mean square error [EB/OL]. (2016-02-26) [2024-03-01].
39	GUO F Y, JI G L. Video anomaly detection method based on dual discriminators and pseudo video generation [J]. Computer Science, 2024, 51(8): 217-223.
40	KIM J, GRAUMAN K. Observe locally, infer globally: a space-time MRF for detecting abnormal activities with incremental updates [C]// Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2009: 2921-2928.
41	DEL GIORNO A, BAGNELL J A, HEBERT M. A discriminative framework for anomaly detection in large videos [C]// Proceedings of the 2016 European Conference Computer Vision, LNCS 9909. Cham: Springer, 2016: 334-349.
42	LUO W, LIU W, GAO S. Remembering history with convolutional LSTM for anomaly detection [C]// Proceedings of the 2017 IEEE International Conference on Multimedia and Expo. Piscataway: IEEE, 2017: 439-444.
43	LUO W, LIU W, LIAN D, et al. Video anomaly detection with sparse coding inspired deep neural networks [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(3): 1070-1084.
44	LU Y, YU F, REDDY M K K, et al. Few-shot scene-adaptive anomaly detection [C]// Proceedings of the 2020 European Conference Computer Vision, LNCS 12350. Cham: Springer, 2020: 125-141.
45	LI Q, YANG R, XIAO F, et al. Attention-based anomaly detection in multi-view surveillance videos [J]. Knowledge-Based Systems, 2022, 252: No.109348.
46	NING Z, WANG Z, LIU Y, et al. Memory-enhanced appearance-motion consistency framework for video anomaly detection [J]. Computer Communications, 2024, 216: 159-167.
47	CHANG Y, TU Z, XIE W, et al. Video anomaly detection with spatio-temporal dissociation [J]. Pattern Recognition, 2022, 122: No.108213.
48	TANG Y, ZHAO L, ZHANG S, et al. Integrating prediction and reconstruction for anomaly detection [J]. Pattern Recognition Letters, 2020, 129: 123-130.
49	WU P, LIU J, LI M, et al. Fast sparse coding networks for anomaly detection in videos [J]. Pattern Recognition, 2020, 107: No.107515.
50	LI N, CHANG F, LIU C. Spatial-temporal cascade autoencoder for video anomaly detection in crowded scenes [J]. IEEE Transactions on Multimedia, 2021, 23: 203-215.
