Unsupervised face forgery video detection based on reconstruction error

Journal of Computer Applications, 2023, Vol. 43, Issue (5): 1571-1577. DOI: 10.11772/j.issn.1001-9081.2022040568

Special topic: Multimedia computing and computer simulation

Zhe XU, Zhihong WANG, Cunyu SHAN, Yaru SUN, Ying YANG

Research and Development Base of Cyberspace Security Technology, The Third Research Institute of The Ministry of Public Security, Shanghai 200031, China

Received: 2022-04-24 Revised: 2022-06-17 Accepted: 2022-06-17 Online: 2022-07-11 Published: 2023-05-10
Corresponding author: Ying YANG (yangying@mcst.org.cn)
About the authors:
XU Zhe, born in 1993, from Chuzhou, Anhui, M. S., research probationer, CCF member. His research interests include natural language processing, temporal anomaly detection, and face forgery detection.
WANG Zhihong, born in 1990, from Taixing, Jiangsu, Ph. D., assistant research fellow, CCF member. His research interests include natural language processing, event mining, and network public hazards management.
SHAN Cunyu, born in 1993, from Dafeng, Jiangsu, M. S., research probationer, CCF member. His research interests include image recognition and data mining.
SUN Yaru, born in 1993, from Heze, Shandong, M. S., research probationer, CCF member. Her research interests include natural language processing and data mining.
YANG Ying, born in 1981, from Shangqiu, Henan, Ph. D., associate research fellow, CCF member. Her research interests include big data analysis and information security.
Supported by:
National Key Research and Development Program of China (2021YFB3101405)

Abstract

Existing supervised face forgery video detection methods require large amounts of labeled data. To address the practical problems that video forgery methods iterate quickly and come in many varieties, the unsupervised idea from temporal anomaly detection was introduced into face forgery video detection, the forgery video detection task was recast as an unsupervised video anomaly detection task, and an unsupervised face forgery video detection model based on reconstruction error was proposed. Firstly, the facial landmark sequence of consecutive frames in the video to be detected was extracted. Secondly, this facial landmark sequence was reconstructed based on multi-granularity information such as deviation features, local features and temporal features. Thirdly, the reconstruction error between the original sequence and the reconstructed sequence was calculated. Finally, a score was calculated from the peak frequency of the reconstruction error to detect forged videos automatically. Experimental results show that, on face video forgery methods such as FaceShifter and FaceSwap, and compared with detection methods such as LRNet (Landmark Recurrent Network) and Xception-c23, the proposed method increases the Area Under Curve (AUC) of detection performance by up to 27.6% and the AUC of transplantation performance by up to 30.4%.

Key words: face forgery detection, unsupervised learning, temporal anomaly detection, generative model, facial landmark
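
The last two steps of the abstract (reconstruction error, then a score from its peak frequency) can be illustrated with a minimal sketch. This is not the authors' implementation: the page gives no formulas, so the per-frame Euclidean error, the local-maximum definition of a peak, and the names `frame_errors`, `peak_frequency`, `threshold` and `max_peak_frequency` are all assumptions made for illustration. The reconstructed sequence is taken as given; the reconstruction model itself is not shown.

```python
import math
from typing import List, Sequence

Frame = Sequence[float]  # flattened landmark coordinates of one frame (assumed layout)


def frame_errors(original: Sequence[Frame], reconstructed: Sequence[Frame]) -> List[float]:
    """Per-frame Euclidean distance between original and reconstructed landmarks."""
    if len(original) != len(reconstructed):
        raise ValueError("sequences must have the same number of frames")
    return [math.dist(o, r) for o, r in zip(original, reconstructed)]


def peak_frequency(errors: Sequence[float], threshold: float) -> float:
    """Fraction of frames that are local maxima of the error curve above `threshold`.

    Assumption: a "peak" is an interior frame whose error exceeds the threshold,
    is larger than the previous frame's error and not smaller than the next one's.
    """
    if len(errors) < 3:
        return 0.0
    peaks = sum(
        1
        for i in range(1, len(errors) - 1)
        if errors[i] > threshold
        and errors[i] > errors[i - 1]
        and errors[i] >= errors[i + 1]
    )
    return peaks / len(errors)


def is_forged(errors: Sequence[float], threshold: float, max_peak_frequency: float) -> bool:
    """Flag a video when its error curve peaks more often than the allowed rate."""
    return peak_frequency(errors, threshold) > max_peak_frequency


if __name__ == "__main__":
    # Toy error curve with three spikes among eight frames.
    toy_errors = [0.1, 0.9, 0.1, 0.8, 0.1, 0.1, 0.7, 0.1]
    print(peak_frequency(toy_errors, threshold=0.5))
    print(is_forged(toy_errors, threshold=0.5, max_peak_frequency=0.2))
```

In an unsupervised setting, both thresholds would presumably be chosen from reconstruction errors on real videos only, since no forged samples are labeled; how the paper sets them is not stated on this page.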

CLC Number: TP391.4

Zhe XU, Zhihong WANG, Cunyu SHAN, Yaru SUN, Ying YANG. Unsupervised face forgery video detection based on reconstruction error[J]. Journal of Computer Applications, 2023, 43(5): 1571-1577.

Model	Deepfake	Face2Face	FaceShifter	FaceSwap	NeuralTextures
LRNet (DF)	0.9643	0.6782	0.6530	0.7571	0.6153
LRNet (NT)	0.7782	0.9449	0.5898	0.6774	0.9209
CNN-GRU-VAE	0.9144	0.6323	0.7526	0.8374	0.5257
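
As a quick arithmetic check on the table above, the sketch below computes the relative gain of CNN-GRU-VAE over each LRNet variant per forgery method. It assumes the table entries are AUC values and that "increase" means relative change, (ours - baseline) / baseline; the page states neither explicitly. Under those assumptions, the largest gain is about 27.6% (FaceShifter, against LRNet (NT)), which appears to match the "up to 27.6%" figure in the abstract.

```python
# Values copied from the table above (assumed to be AUC scores).
auc = {
    "LRNet (DF)": {"Deepfake": 0.9643, "Face2Face": 0.6782, "FaceShifter": 0.6530,
                   "FaceSwap": 0.7571, "NeuralTextures": 0.6153},
    "LRNet (NT)": {"Deepfake": 0.7782, "Face2Face": 0.9449, "FaceShifter": 0.5898,
                   "FaceSwap": 0.6774, "NeuralTextures": 0.9209},
    "CNN-GRU-VAE": {"Deepfake": 0.9144, "Face2Face": 0.6323, "FaceShifter": 0.7526,
                    "FaceSwap": 0.8374, "NeuralTextures": 0.5257},
}


def relative_gain(ours: float, baseline: float) -> float:
    """Relative change of `ours` with respect to `baseline`."""
    return (ours - baseline) / baseline


# Largest relative gain of CNN-GRU-VAE over either LRNet variant.
best = max(
    (relative_gain(auc["CNN-GRU-VAE"][method], auc[baseline][method]), baseline, method)
    for baseline in ("LRNet (DF)", "LRNet (NT)")
    for method in auc["CNN-GRU-VAE"]
)

if __name__ == "__main__":
    gain, baseline, method = best
    print(f"{method} vs {baseline}: {gain:.1%}")
```

The same table also shows that each LRNet variant scores highest on the method named in its suffix (DF on Deepfake, NT on NeuralTextures), while the gains of CNN-GRU-VAE are on FaceShifter and FaceSwap.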

Model	FaceForensics++	Celeb-DF
Two-stream	0.701	0.538
Meso4	0.847	0.548
MesoInception4	0.830	0.536
FWA	0.801	0.569
DSP-FWA	0.930	0.646
Xception-c23	0.997	0.653
Capsule	0.966	0.575
LRNet	0.964	0.569
CNN-GRU-VAE	0.914	0.606

Model	GPU memory usage/GB	Disk usage/GB	Training time/h
Xception	12	64	21
X-Ray	>12	>180	>30
CNN+RNN	9	64	22.5
TSN	>12	>120	>30
LRNet	3	1.1	0.2
CNN-GRU-VAE	1.4	1.1	0.1

Model	Deepfake	Face2Face	FaceShifter	FaceSwap	NeuralTextures	Celeb-DF
CNN-GRU-VAE	0.9144	0.6323	0.7526	0.8374	0.5257	0.6063
Without deviation features	-0.0591	+0.0025	-0.0602	-0.0618	+0.0405	-0.0450
GRU-VAE	-0.0372	-0.0014	-0.0477	-0.0042	-0.0048	-0.0254
CNN-GRU-AE	-0.0248	-0.0431	-0.0003	-0.0070	+0.0373	-0.0136
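
The signed rows in the table above read as differences from the full CNN-GRU-VAE model in the first row. That reading is an assumption, since the page carries no caption. Under it, the absolute score of each ablated variant can be recovered by adding the delta to the full-model value, as in this small sketch (variant labels are translated from the table).

```python
# Full-model scores and per-variant deltas, copied from the table above.
full = {"Deepfake": 0.9144, "Face2Face": 0.6323, "FaceShifter": 0.7526,
        "FaceSwap": 0.8374, "NeuralTextures": 0.5257, "Celeb-DF": 0.6063}

deltas = {
    "Without deviation features": {"Deepfake": -0.0591, "Face2Face": 0.0025,
                                   "FaceShifter": -0.0602, "FaceSwap": -0.0618,
                                   "NeuralTextures": 0.0405, "Celeb-DF": -0.0450},
    "GRU-VAE": {"Deepfake": -0.0372, "Face2Face": -0.0014, "FaceShifter": -0.0477,
                "FaceSwap": -0.0042, "NeuralTextures": -0.0048, "Celeb-DF": -0.0254},
    "CNN-GRU-AE": {"Deepfake": -0.0248, "Face2Face": -0.0431, "FaceShifter": -0.0003,
                   "FaceSwap": -0.0070, "NeuralTextures": 0.0373, "Celeb-DF": -0.0136},
}

# Assumed relation: absolute score = full-model score + signed delta.
absolute = {
    variant: {name: round(full[name] + delta[name], 4) for name in full}
    for variant, delta in deltas.items()
}

if __name__ == "__main__":
    for variant, scores in absolute.items():
        print(variant, scores)
```

For example, removing the deviation features would give 0.9144 - 0.0591 = 0.8553 on Deepfake under this reading.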
