Video anomaly detection for moving foreground regions

doi:10.11772/j.issn.1001-9081.2024040519

Journal of Computer Applications, 2025, Vol. 45, Issue (4): 1300-1309. DOI: 10.11772/j.issn.1001-9081.2024040519

• Multimedia computing and computer simulation •

Lihu PAN, Shouxin PENG, Rui ZHANG, Zhiyang XUE, Xuzhen MAO

College of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan, Shanxi 030024, China

Received: 2024-04-25 Revised: 2024-09-12 Accepted: 2024-09-14 Online: 2025-04-08 Published: 2025-04-10
Contact: Lihu PAN
About the authors: PENG Shouxin, born in 1998 in Jiujiang, Jiangxi, M.S. candidate. His research interests include deep learning and video anomaly detection.
ZHANG Rui, born in 1987 in Taiyuan, Shanxi, Ph.D., associate professor. His research interests include intelligent information processing and automated machine learning.
XUE Zhiyang, born in 1999 in Taiyuan, Shanxi, M.S. candidate. His research interests include artificial intelligence and object detection.
MAO Xuzhen, born in 1995 in Lüliang, Shanxi, M.S. candidate. Her research interests include video anomaly detection.
Supported by: Shanxi Provincial Basic Research Program (202203021221145); Shanxi Province Graduate Joint Cultivation Demonstration Base Project (2022JD11)

Abstract:

The imbalance in data distribution between static background information and moving foreground objects often leads to insufficient learning of the foreground regions where anomalies occur, which reduces the accuracy of Video Anomaly Detection (VAD). To address this issue, a Nested U-shaped Frame Predictive Generative Adversarial Network (NUFP-GAN) was proposed for VAD. In the proposed method, a nested U-shaped frame prediction network architecture, which is able to highlight salient targets in video frames, was used as the frame prediction module. In the discrimination phase, a self-attention patch discriminator was designed to extract the more important appearance and motion features from video frames using receptive fields of different sizes, thereby improving the accuracy of anomaly detection. In addition, to keep the multi-scale features of predicted frames and real frames consistent in high-level semantic information, a multi-scale consistency loss was introduced to further improve anomaly detection performance. Experimental results show that the proposed method achieves Area Under Curve (AUC) values of 87.6%, 85.2%, 96.0%, and 73.3% on the CUHK Avenue, UCSD Ped1, UCSD Ped2, and ShanghaiTech datasets, respectively; on the ShanghaiTech dataset, its AUC value is 1.8 percentage points higher than that of the Memory-enhanced Appearance-Motion Consistency (MAMC) method. These results indicate that the proposed method effectively addresses the challenges posed by data distribution imbalance in VAD.

Key words: deep learning, Video Anomaly Detection (VAD), Generative Adversarial Network (GAN), future frame prediction, unsupervised learning

CLC Number: TP391.4

Lihu PAN, Shouxin PENG, Rui ZHANG, Zhiyang XUE, Xuzhen MAO. Video anomaly detection for moving foreground regions[J]. Journal of Computer Applications, 2025, 45(4): 1300-1309.

References

1	BENEZETH Y, JODOIN P M, SALIGRAMA V, et al. Abnormal events detection based on spatio-temporal co-occurences[C]// Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2009: 2458-2465.
2	MAHADEVAN V, LI W, BHALODIA V, et al. Anomaly detection in crowded scenes[C]// Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2010: 1975-1981.
3	LUO W, LIU W, GAO S. A revisit of sparse coding based anomaly detection in stacked RNN framework[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 341-349.
4	MEHRAN R, OYAMA A, SHAH M. Abnormal crowd behavior detection using social force model[C]// Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2009: 935-942.
5	HASAN M, CHOI J, NEUMANN J, et al. Learning temporal regularity in video sequences[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 733-742.
6	SABOKROU M, KHALOOEI M, FATHY M, et al. Adversarially learned one-class classifier for novelty detection[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 3379-3388.
7	GOODFELLOW I, POUGET-ABADIE J, MIRZA M, et al. Generative adversarial networks[J]. Communications of the ACM, 2020, 63(11): 139-144.
8	LIU W, LUO W, LIAN D, et al. Future frame prediction for anomaly detection – a new baseline[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 6536-6545.
9	LI C, LI H, ZHANG G. Future frame prediction based on generative assistant discriminative network for anomaly detection[J]. Applied Intelligence, 2023, 53: 542-559.
10	SINGH R, SETHI A, SAINI K, et al. CVAD-GAN: constrained video anomaly detection via generative adversarial network[J]. Image and Vision Computing, 2024, 143: No.104950.
11	LI H, CHEN J, HUANG X, et al. FOAD: a novel video anomaly detection focusing on objects[J]. Multimedia Tools and Applications, 2024, 83(7): 20637-20651.
12	XU H, LIU W, XING W, et al. Motion-aware future frame prediction for video anomaly detection based on saliency perception[J]. Signal, Image and Video Processing, 2022, 16: 2121-2129.
13	QIN X, ZHANG Z, HUANG C, et al. U²-Net: going deeper with nested U-structure for salient object detection[J]. Pattern Recognition, 2020, 106: No.107404.
14	LU C, SHI J, JIA J. Abnormal event detection at 150 FPS in MATLAB[C]// Proceedings of the 2013 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2013: 2720-2727.
15	GIRSHICK R. Fast R-CNN[C]// Proceedings of the 2015 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2015: 1440-1448.
16	HU X, LIAN J, ZHANG D, et al. Video anomaly detection based on 3D convolutional auto-encoder[J]. Signal, Image and Video Processing, 2022, 16: 1885-1893.
17	WANG L, ZHOU F, LI Z, et al. Abnormal event detection in videos using hybrid spatio-temporal autoencoder[C]// Proceedings of the 25th IEEE International Conference on Image Processing. Piscataway: IEEE, 2018: 2276-2280.
18	SHAO W, RAJAPAKSHA P, WEI Y, et al. COVAD: content-oriented video anomaly detection using a self-attention based deep learning model[J]. Virtual Reality and Intelligent Hardware, 2023, 5(1): 24-41.
19	ZHENG Z, YANG X W, XIE J B, et al. Autoencoder video anomaly detection based on hybrid attention[J]. Computer Engineering and Design, 2024, 45(2): 516-523.
20	PARK H, NOH J, HAM B. Learning memory-guided normality for anomaly detection[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 14360-14369.
21	GONG D, LIU L, LE V, et al. Memorizing normality to detect anomaly: memory-augmented deep autoencoder for unsupervised anomaly detection[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 1705-1714.
22	LE V T, KIM Y G. Attention-based residual autoencoder for video anomaly detection[J]. Applied Intelligence, 2023, 53(3): 3240-3254.
23	NGUYEN T N, MEUNIER J. Anomaly detection in video sequence with appearance-motion correspondence[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 1273-1283.
24	DOSHI K, YILMAZ Y. Online anomaly detection in surveillance videos with asymptotic bound on false alarm rate[J]. Pattern Recognition, 2021, 114: No.107865.
25	ZAHEER M Z, MAHMOOD A, KHAN M H, et al. Generative cooperative learning for unsupervised video anomaly detection[C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 14724-14734.
26	AKCAY S, ATAPOUR-ABARGHOUEI A, BRECKON T P. GANomaly: semi-supervised anomaly detection via adversarial training[C]// Proceedings of the 2018 Asian Conference on Computer Vision, LNCS 11363. Cham: Springer, 2019: 622-637.
27	PERERA P, NALLAPATI R, XIANG B. OCGAN: one-class novelty detection using GANs with constrained latent representations[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 2893-2901.
28	ACSINTOAE A, FLORESCU A, GEORGESCU M I, et al. Ubnormal: new benchmark for supervised open-set video anomaly detection[C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 20111-20121.
29	LEE S, KIM H G, RO Y M. STAN: spatio-temporal adversarial networks for abnormal event detection[C]// Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2018: 1323-1327.
30	LIU Y, LIU J, YANG K, et al. AMP-net: appearance-motion prototype network assisted automatic video anomaly detection system[J]. IEEE Transactions on Industrial Informatics, 2024, 20(2): 2843-2855.
31	SUN Z, WANG P, ZHENG W, et al. Dual GroupGAN: an unsupervised four-competitor (2V2) approach for video anomaly detection[J]. Pattern Recognition, 2024, 153: No.110500.
32	SINGH R, SETHI A, SAINI K, et al. Attention-guided generator with dual discriminator GAN for real-time video anomaly detection[J]. Engineering Applications of Artificial Intelligence, 2024, 131: No.107830.
33	LIM B, SON S, KIM H, et al. Enhanced deep residual networks for single image super-resolution[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 1132-1140.
34	VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2017: 6000-6010.
35	ISOLA P, ZHU J Y, ZHOU T, et al. Image-to-image translation with conditional adversarial networks[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 5967-5976.
36	JOHNSON J, ALAHI A, LI F F. Perceptual losses for real-time style transfer and super-resolution[C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9906. Cham: Springer, 2016: 694-711.
37	WANG X, CHE Z, JIANG B, et al. Robust unsupervised video anomaly detection by multipath frame prediction[J]. IEEE Transactions on Neural Networks and Learning Systems, 2022, 33(6): 2301-2312.
38	MATHIEU M, COUPRIE C, LeCUN Y. Deep multi-scale video prediction beyond mean square error[EB/OL]. (2016-02-26)[2024-03-01].
39	GUO F Y, JI G L. Video anomaly detection method based on dual discriminators and pseudo video generation[J]. Computer Science, 2024, 51(8): 217-223.
40	KIM J, GRAUMAN K. Observe locally, infer globally: a space-time MRF for detecting abnormal activities with incremental updates[C]// Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2009: 2921-2928.
41	DEL GIORNO A, BAGNELL J A, HEBERT M. A discriminative framework for anomaly detection in large videos[C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9909. Cham: Springer, 2016: 334-349.
42	LUO W, LIU W, GAO S. Remembering history with convolutional LSTM for anomaly detection[C]// Proceedings of the 2017 IEEE International Conference on Multimedia and Expo. Piscataway: IEEE, 2017: 439-444.
43	LUO W, LIU W, LIAN D, et al. Video anomaly detection with sparse coding inspired deep neural networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(3): 1070-1084.
44	LU Y, YU F, REDDY M K K, et al. Few-shot scene-adaptive anomaly detection[C]// Proceedings of the 2020 European Conference on Computer Vision, LNCS 12350. Cham: Springer, 2020: 125-141.
45	LI Q, YANG R, XIAO F, et al. Attention-based anomaly detection in multi-view surveillance videos[J]. Knowledge-Based Systems, 2022, 252: No.109348.
46	NING Z, WANG Z, LIU Y, et al. Memory-enhanced appearance-motion consistency framework for video anomaly detection[J]. Computer Communications, 2024, 216: 159-167.
47	CHANG Y, TU Z, XIE W, et al. Video anomaly detection with spatio-temporal dissociation[J]. Pattern Recognition, 2022, 122: No.108213.
48	TANG Y, ZHAO L, ZHANG S, et al. Integrating prediction and reconstruction for anomaly detection[J]. Pattern Recognition Letters, 2020, 129: 123-130.
49	WU P, LIU J, LI M, et al. Fast sparse coding networks for anomaly detection in videos[J]. Pattern Recognition, 2020, 107: No.107515.
50	LI N, CHANG F, LIU C. Spatial-temporal cascade autoencoder for video anomaly detection in crowded scenes[J]. IEEE Transactions on Multimedia, 2021, 23: 203-215.

Comparison of AUC (%) on the four datasets ("-" means no value is listed)
Method type	Method	CUHK Avenue	UCSD Ped1	UCSD Ped2	ShanghaiTech
Traditional	Method of Ref. [40]	-	59.0	69.3	-
	Method of Ref. [2]	-	81.8	82.9	-
	Method of Ref. [41]	78.3	-	-	-
Reconstruction	Conv-AE [5]	80.0	75.0	85.0	60.9
	ConvLSTM-AE [42]	77.0	75.5	88.1	-
	sRNN [3]	81.7	-	92.2	68.0
	MemAE [21]	83.3	-	94.1	71.2
	MNAD-Recon [20]	82.8	-	90.2	69.8
	sRNN-AE [43]	83.5	-	92.2	69.6
Prediction	Method of Ref. [8]	84.9	83.1	95.4	72.8
	Method of Ref. [44]	85.3	83.7	95.9	73.7
	Method of Ref. [45]	86.0	-	95.4	74.1
	MAMC [46]	87.6	-	96.7	71.5
Reconstruction + prediction	Method of Ref. [48]	83.7	-	96.2	71.5
Reconstruction + prediction	Method of Ref. [47]	80.6	-	89.4	68.9
NUFP-GAN		87.6	85.2	96.0	73.3
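
The AUC values reported for each method can be read as the probability that a randomly chosen abnormal frame receives a higher anomaly score than a randomly chosen normal frame. The following is a minimal Python sketch of that computation, not the authors' evaluation code; the function name `frame_level_auc`, the use of per-frame scores, and the toy scores and labels are assumptions made purely for illustration.

```python
from typing import Sequence


def frame_level_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison.

    scores: anomaly score per frame (higher means more anomalous).
    labels: 1 for an abnormal frame, 0 for a normal frame.
    """
    abnormal = [s for s, y in zip(scores, labels) if y == 1]
    normal = [s for s, y in zip(scores, labels) if y == 0]
    if not abnormal or not normal:
        raise ValueError("need at least one abnormal and one normal frame")

    wins = 0.0
    for a in abnormal:
        for n in normal:
            if a > n:
                wins += 1.0   # abnormal frame correctly ranked above normal
            elif a == n:
                wins += 0.5   # ties count as half a correct ranking
    return wins / (len(abnormal) * len(normal))


# Toy data (hypothetical): four frames, the last two are abnormal.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
toy_auc = frame_level_auc(toy_scores, toy_labels)
print(f"AUC = {toy_auc:.2f}")
```

On the toy data, three of the four abnormal/normal pairs are ranked correctly, so the sketch yields an AUC of 0.75. The pairwise loop is quadratic in the number of frames; it is chosen here for readability rather than speed.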

Method	UCSD Ped1	UCSD Ped2	CUHK Avenue
Method of Ref. [40]	40.0	30.0	-
Method of Ref. [2]	25.0	25.0	-
Conv-AE [5]	27.9	21.7	25.1
Method of Ref. [8]	-	11.9	22.3
Method of Ref. [49]	25.2	12.5	20.7
ST-CaAE [50]	15.3	16.7	24.4
Method of Ref. [45]	-	20.0	23.0
NUFP-GAN	21.4	10.2	20.3

Combination No.	Baseline model	Nested U-shaped frame prediction network	Self-attention patch discriminator	Multi-scale consistency loss	AUC/%
1	√	×	×	×	83.1
2	×	√	×	×	84.3
3	×	×	√	×	84.1
4	×	×	×	√	83.2
5	×	√	√	×	85.0
6	×	×	√	√	84.6
7	×	√	×	√	84.9
8	×	√	√	√	85.2
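
As a small worked example, the Python sketch below uses only the AUC values in the ablation table above to compute how many percentage points each combination gains over combination 1 (the baseline model alone). The dictionary layout and variable names are illustrative assumptions, not part of the original work.

```python
# AUC (%) for each combination number, copied from the ablation table.
ablation_auc = {
    1: 83.1,  # baseline model only
    2: 84.3,  # nested U-shaped frame prediction network
    3: 84.1,  # self-attention patch discriminator
    4: 83.2,  # multi-scale consistency loss
    5: 85.0,  # nested U-shaped network + patch discriminator
    6: 84.6,  # patch discriminator + consistency loss
    7: 84.9,  # nested U-shaped network + consistency loss
    8: 85.2,  # all three components
}

baseline_auc = ablation_auc[1]

# Gain over the baseline in percentage points, rounded to one decimal place.
gains = {
    combo: round(auc - baseline_auc, 1)
    for combo, auc in ablation_auc.items()
    if combo != 1
}

for combo, gain in sorted(gains.items()):
    print(f"combination {combo}: +{gain:.1f} percentage points")
```

With the table's values, the full combination (No. 8) gains 2.1 percentage points over the baseline, while the multi-scale consistency loss on its own (No. 4) gains 0.1.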
