Lightweight image tamper localization algorithm based on large kernel attention convolution

Journal of Computer Applications, 2023, Vol. 43, Issue (9): 2692-2699. DOI: 10.11772/j.issn.1001-9081.2022091405

• 2022 10th CCF Conference on Big Data •

Hong WANG, Qing QIAN, Huan WANG, Yong LONG

School of Information, Guizhou University of Finance and Economics, Guiyang, Guizhou 550025, China

Received: 2022-09-19 Revised: 2022-10-18 Accepted: 2022-10-21 Online: 2023-09-10 Published: 2023-09-10
Contact: Qing QIAN
About the authors: WANG Hong, born in 1995 in Nanchong, Sichuan, M.S. candidate, CCF member. His research interests include artificial intelligence and passive image forensics.
WANG Huan, born in 1987 in Chongqing, Ph.D., lecturer. Her research interests include image forensics and multimedia security.
LONG Yong, born in 1997 in Yibin, Sichuan, M.S. candidate, CCF member. Her research interests include multimedia forensics and image steganography.
Supported by:
National Natural Science Foundation of China (61902085); Science and Technology Program of Guizhou Province (QianKeHeJiChu-ZK[2021]General 311); Natural Science Research Project of Department of Education of Guizhou Province (QianJiaoHe KY[2021]136)

Abstract:

Convolutional Neural Networks (CNNs) are widely used in image forensics because they are highly discriminative, easy to understand, and readily trainable. However, their receptive field grows slowly, they neglect long-range dependencies, and they are computationally expensive. As a result, deep learning algorithms fall short in both accuracy and lightweight deployment, and are poorly suited to scenarios that require lightweight image tamper localization. To address these problems, a lightweight network-based image copy-paste tamper detection algorithm named LKA-EfficientNet (Large Kernel Attention EfficientNet) was proposed. LKA-EfficientNet captures long-range dependencies and has a global receptive field, and it reduces the parameter count of EfficientNetV2, which improves both the speed and the accuracy of image tamper localization. Firstly, the input image was processed by a backbone network based on Large Kernel Attention (LKA) convolution to obtain candidate feature maps. Then, feature maps of different scales were used to construct a feature pyramid for feature matching. Finally, the matched feature maps were fused to locate the tampered region of the image. In addition, LKA-EfficientNet used a triplet cross-entropy loss function to further improve the accuracy of tamper localization. Experimental results show that, compared with Dense-InceptionNet, an algorithm of the same type, LKA-EfficientNet reduces floating-point operations by 29.54% and increases the F1 score by 4.88%, which verifies that LKA-EfficientNet can reduce computational cost while maintaining high detection performance.

Key words: image tamper detection, lightweight network, attention mechanism, multi-scale feature pyramid, passive forensics
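
As a rough illustration of why a large kernel attention backbone can stay lightweight, the sketch below counts convolution weights for a dense large-kernel convolution and for the three-part decomposition described in the Visual Attention Network reference [2]: a depth-wise convolution, a depth-wise dilated convolution, and a 1×1 convolution. The kernel size 21, dilation 3, and channel width 64 are illustrative assumptions; this page does not state which values LKA-EfficientNet uses, and biases are ignored.

```python
import math


def lka_weight_count(channels: int, kernel: int = 21, dilation: int = 3) -> int:
    """Weights (no biases) in the decomposed large-kernel convolution.

    Assumed decomposition, following reference [2]:
      1. (2d-1) x (2d-1) depth-wise convolution (local context)
      2. ceil(K/d) x ceil(K/d) depth-wise dilated convolution (long range)
      3. 1 x 1 convolution (channel mixing)
    """
    local_k = 2 * dilation - 1
    dilated_k = math.ceil(kernel / dilation)
    depthwise = local_k * local_k * channels
    depthwise_dilated = dilated_k * dilated_k * channels
    pointwise = channels * channels
    return depthwise + depthwise_dilated + pointwise


def dense_weight_count(channels: int, kernel: int = 21) -> int:
    """Weights (no biases) in a standard K x K convolution, C -> C channels."""
    return kernel * kernel * channels * channels


channels = 64  # hypothetical width, chosen only for illustration
decomposed = lka_weight_count(channels)
dense = dense_weight_count(channels)
print(f"decomposed: {decomposed}, dense: {dense}, ratio: {dense / decomposed:.1f}x")
```

In reference [2] the output of this stack is used as an attention map that is multiplied element-wise with the input; that step adds no weights, so it does not appear in the count.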

CLC Number: TP751

Hong WANG, Qing QIAN, Huan WANG, Yong LONG. Lightweight image tamper localization algorithm based on large kernel attention convolution[J]. Journal of Computer Applications, 2023, 43(9): 2692-2699.

References

1	TIAN X X, LI H Q, ZHANG Q, et al. Dual-channel R-FCN model for image forgery detection[J]. Chinese Journal of Computers, 2021, 44(2): 370-383. DOI: 10.11897/SP.J.1016.2021.00370
2	GUO M H, LU C Z, LIU Z N, et al. Visual attention network[EB/OL]. (2022-07-11)[2022-07-01].
3	ALOM M Z, TAHA T M, YAKOPCIC C, et al. A state-of-the-art survey on deep learning theory and architectures[J]. Electronics, 2019, 8(3): No. 292. DOI: 10.3390/electronics8030292
4	DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: Transformers for image recognition at scale[EB/OL]. (2021-06-03)[2022-07-01].
5	MEHTA S, RASTEGARI M. MobileViT: light-weight, general-purpose, and mobile-friendly vision transformer[EB/OL]. (2022-03-04)[2022-07-01].
6	TAN M X, LE Q V. EfficientNetV2: smaller models and faster training[C]// Proceedings of the 38th International Conference on Machine Learning. New York: JMLR.org, 2021: 10096-10106.
7	FRIDRICH J, SOUKAL D, LUKÁŠ J. Detection of copy-move forgery in digital images[EB/OL]. [2022-06-11].
8	COZZOLINO D, POGGI G, VERDOLIVA L. Efficient dense-field copy-move forgery detection[J]. IEEE Transactions on Information Forensics and Security, 2015, 10(11): 2284-2297. DOI: 10.1109/tifs.2015.2455334
9	POPESCU A C, FARID H. Exposing digital forgeries by detecting duplicated image regions: TR2004-515[R/OL]. (2004-08-01)[2022-04-11].
10	RUBLEE E, RABAUD V, KONOLIGE K, et al. ORB: an efficient alternative to SIFT or SURF[C]// Proceedings of the 2011 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2011: 2564-2571. DOI: 10.1109/iccv.2011.6126544
11	TAREEN S A K, SALEEM Z. A comparative analysis of SIFT, SURF, KAZE, AKAZE, ORB, and BRISK[C]// Proceedings of the 2018 International Conference on Computing, Mathematics and Engineering Technologies. Piscataway: IEEE, 2018: 1-10. DOI: 10.1109/icomet.2018.8346440
12	RAO Y, NI J Q. A deep learning approach to detection of splicing and copy-move forgeries in images[C]// Proceedings of the 2016 IEEE International Workshop on Information Forensics and Security. Piscataway: IEEE, 2016: 1-6. DOI: 10.1109/wifs.2016.7823911
13	WU Y, ABD-ALMAGEED W, NATARAJAN P. BusterNet: detecting copy-move image forgery with source/target localization[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11210. Cham: Springer, 2018: 170-186.
14	CHEN B J, TAN W J, COATRIEUX G, et al. A serial image copy-move forgery localization scheme with source/target distinguishment[J]. IEEE Transactions on Multimedia, 2021, 23: 3506-3517. DOI: 10.1109/tmm.2020.3026868
15	ZHOU P, HAN X T, MORARIU V I, et al. Learning rich features for image manipulation detection[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 1053-1061. DOI: 10.1109/cvpr.2018.00116
16	WU Y, ABD-ALMAGEED W, NATARAJAN P. ManTra-Net: manipulation tracing network for detection and localization of image forgeries with anomalous features[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 9535-9544. DOI: 10.1109/cvpr.2019.00977
17	XU D, YUE Z, YANG W X, et al. Tampered image recognition based on improved three-stream Faster R-CNN[J]. Journal of Computer Applications, 2020, 40(5): 1315-1321. DOI: 10.11772/j.issn.1001-9081.2019081515
18	ZHONG J L, PUN C M. An end-to-end Dense-InceptionNet for image copy-move forgery detection[J]. IEEE Transactions on Information Forensics and Security, 2020, 15: 2134-2146. DOI: 10.1109/tifs.2019.2957693
19	WU X, LIU X, ZHAO J W. A lightweight multiscale fusion algorithm for image tampering detection[J]. Computer Engineering, 2022, 48(2): 224-229, 236. DOI: 10.19678/j.issn.1000-3428.0060066
20	BARNI M, PHAN Q T, TONDI B. Copy move source-target disambiguation through multi-branch CNNs[J]. IEEE Transactions on Information Forensics and Security, 2021, 16: 1825-1840. DOI: 10.1109/tifs.2020.3045903
21	LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 936-944. DOI: 10.1109/cvpr.2017.106
22	SCHROFF F, KALENICHENKO D, PHILBIN J. FaceNet: a unified embedding for face recognition and clustering[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 815-823. DOI: 10.1109/cvpr.2015.7298682
23	BEIS J S, LOWE D G. Shape indexing using approximate nearest-neighbour search in high-dimensional spaces[C]// Proceedings of the 1997 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 1997: 1000-1006.
24	DONG J, WANG W, TAN T N. CASIA image tampering detection evaluation database[C]// Proceedings of the 2013 IEEE China Summit and International Conference on Signal and Information Processing. Piscataway: IEEE, 2013: 422-426. DOI: 10.1109/chinasip.2013.6625374
25	TRALIC D, ZUPANCIC I, GRGIC S, et al. CoMoFoD - new database for copy-move forgery detection[C]// Proceedings of the 2013 International Symposium on Electronics in Marine. Piscataway: IEEE, 2013: 49-54.
26	ARDIZZONE E, BRUNO A, MAZZOLA G. Copy-move forgery detection by matching triangles of keypoints[J]. IEEE Transactions on Information Forensics and Security, 2015, 10(10): 2084-2094. DOI: 10.1109/tifs.2015.2445742
27	AMERINI I, BALLAN L, CALDELLI R, et al. A SIFT-based forensic method for copy-move attack detection and transformation recovery[J]. IEEE Transactions on Information Forensics and Security, 2011, 6(3): 1099-1110. DOI: 10.1109/tifs.2011.2129512
28	WEN B H, ZHU Y, SUBRAMANIAN R, et al. COVERAGE - a novel database for copy-move forgery detection[C]// Proceedings of the 2016 IEEE International Conference on Image Processing. Piscataway: IEEE, 2016: 161-165. DOI: 10.1109/icip.2016.7532339
29	AMERINI I, BALLAN L, CALDELLI R, et al. Copy-move forgery detection and localization by means of robust clustering with J-Linkage[J]. Signal Processing: Image Communication, 2013, 28(6): 659-669. DOI: 10.1016/j.image.2013.03.006
30	GOUTTE C, GAUSSIER E. A probabilistic interpretation of precision, recall and F-score, with implication for evaluation[C]// Proceedings of the 2005 European Conference on Information Retrieval, LNCS 3408. Berlin: Springer, 2005: 345-359.
31	MOLCHANOV P, TYREE S, KARRAS T, et al. Pruning convolutional neural networks for resource efficient inference[EB/OL]. (2017-06-08)[2022-07-01].
32	HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 770-778. DOI: 10.1109/cvpr.2016.90
33	ZHANG X Y, ZHOU X Y, LIN M X, et al. ShuffleNet: an extremely efficient convolutional neural network for mobile devices[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 6848-6856. DOI: 10.1109/cvpr.2018.00716
34	RADOSAVOVIC I, KOSARAJU R P, GIRSHICK R, et al. Designing network design spaces[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 10425-10433. DOI: 10.1109/cvpr42600.2020.01044
35	MARRA F, GRAGNANIELLO D, VERDOLIVA L, et al. A full-image full-resolution end-to-end-trainable CNN framework for image forgery detection[J]. IEEE Access, 2020, 8: 133488-133502. DOI: 10.1109/access.2020.3009877

Stage	Operator	Stride	Output channels	Number of layers
0	Conv3×3	2	24	1
1	Fused-MBConv1, K=3×3	1	24	2
2	Fused-MBConv4, K=3×3	2	48	4
3	Fused-MBConv4, K=3×3	2	64	4
4	MBConv4, K=3×3, SE=0.25	2	128	6
5	MBConv6, K=3×3, SE=0.25	1	160	9
6	MBConv6, K=3×3, SE=0.25	2	256	15

Stage	Operator	Stride	Output channels	Number of layers
0	LKA	2	12	1
1	Fused-MBConv1, K=3×3	1	24	2
2	Fused-MBConv4, K=3×3	2	36	4
3	Fused-MBConv4, K=3×3	2	48	4
4	MBConv4, K, SE=0.25	2	92	6
5	MBConv6, K, SE=0.25	1	128	3
6	MBConv6, K, SE=0.25	2	192	5
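
To make the two stage configurations above easier to compare, the following sketch encodes both tables as plain Python data and derives the total number of layers and the feature-map size after each stage. The captions of the tables are missing on this page, so the names `BASELINE` (first table, which appears to correspond to the EfficientNetV2 baseline) and `LKA_VARIANT` (second table) are labels chosen here. The `Stage` class, the 256×256 input size, and the assumption that each stage applies its stride once are likewise illustrative choices, not details taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Stage:
    operator: str
    stride: int
    out_channels: int
    repeats: int


# First table (label assumed: baseline configuration).
BASELINE = [
    Stage("Conv3x3", 2, 24, 1),
    Stage("Fused-MBConv1", 1, 24, 2),
    Stage("Fused-MBConv4", 2, 48, 4),
    Stage("Fused-MBConv4", 2, 64, 4),
    Stage("MBConv4", 2, 128, 6),
    Stage("MBConv6", 1, 160, 9),
    Stage("MBConv6", 2, 256, 15),
]

# Second table (label assumed: LKA-based configuration).
LKA_VARIANT = [
    Stage("LKA", 2, 12, 1),
    Stage("Fused-MBConv1", 1, 24, 2),
    Stage("Fused-MBConv4", 2, 36, 4),
    Stage("Fused-MBConv4", 2, 48, 4),
    Stage("MBConv4", 2, 92, 6),
    Stage("MBConv6", 1, 128, 3),
    Stage("MBConv6", 2, 192, 5),
]


def total_layers(stages: List[Stage]) -> int:
    """Sum of the per-stage layer counts."""
    return sum(stage.repeats for stage in stages)


def feature_map_sizes(stages: List[Stage], input_size: int) -> List[int]:
    """Side length after each stage, assuming one strided layer per stage
    and padding such that the stride divides the size evenly."""
    sizes = []
    size = input_size
    for stage in stages:
        size //= stage.stride
        sizes.append(size)
    return sizes


for name, stages in (("baseline", BASELINE), ("LKA variant", LKA_VARIANT)):
    print(name, total_layers(stages), feature_map_sizes(stages, 256))
```

Under these assumptions both configurations downsample by the same overall factor, so the differences lie in the stem operator, the channel widths, and the depth of the last two stages.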

Number of layers	EfficientNetV2 accuracy/%	EfficientNetV2+LKA accuracy/%
16	79.7	80.0
18	82.0	82.2
20	84.8	84.2
22	85.3	84.9
24	82.6	85.7
26	82.7	88.3
28	80.5	88.0
30	81.1	87.8
32	80.0	88.0
34	80.5	88.4
36	80.2	88.3
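
The trend in the table above can be summarized with a few lines of Python. The sketch below uses only the numbers in the table; the helper names are hypothetical. It reports the depth at which each backbone reaches its best accuracy and how far accuracy has fallen from that peak at the deepest setting.

```python
from typing import List, Tuple

# (number of layers, EfficientNetV2 accuracy %, EfficientNetV2+LKA accuracy %)
ROWS: List[Tuple[int, float, float]] = [
    (16, 79.7, 80.0), (18, 82.0, 82.2), (20, 84.8, 84.2), (22, 85.3, 84.9),
    (24, 82.6, 85.7), (26, 82.7, 88.3), (28, 80.5, 88.0), (30, 81.1, 87.8),
    (32, 80.0, 88.0), (34, 80.5, 88.4), (36, 80.2, 88.3),
]

PLAIN, WITH_LKA = 1, 2  # column indices into each row


def best_depth(rows: List[Tuple[int, float, float]], column: int) -> int:
    """Number of layers at which the chosen column peaks."""
    return max(rows, key=lambda row: row[column])[0]


def drop_from_peak(rows: List[Tuple[int, float, float]], column: int) -> float:
    """Accuracy lost between the peak and the deepest configuration."""
    peak = max(row[column] for row in rows)
    return round(peak - rows[-1][column], 1)


for label, column in (("EfficientNetV2", PLAIN), ("EfficientNetV2+LKA", WITH_LKA)):
    print(label, best_depth(ROWS, column), drop_from_peak(ROWS, column))
```

Read this way, the plain backbone peaks early and then degrades as layers are added, while the LKA variant keeps most of its accuracy at greater depth.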

Network	Accuracy/% (16 layers)	Accuracy/% (20 layers)	Accuracy/% (24 layers)	Accuracy/% (28 layers)
ResNet	81.2	82.7	84.3	83.9
ResNet+LKA	82.0	84.2	85.5	86.7
ShuffleNet	79.6	81.1	81.0	79.6
ShuffleNet+LKA	79.8	83.2	84.7	84.8
RegNet	75.7	82.8	83.1	82.2
RegNet+LKA	76.0	82.5	83.5	84.5

Repeats per stage	Accuracy/%	Repeats per stage	Accuracy/%
1, 2, 3, 4, 5, 9	88.3	1, 2, 2, 3, 5, 11	87.2
2, 4, 4, 6, 3, 5	90.1	1, 2, 2, 4, 5, 10	87.5
1, 3, 4, 5, 4, 7	89.3	2, 3, 4, 5, 4, 6	89.7
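
One way to read the table above is as a search over how a fixed layer budget is distributed across six stages. The sketch below uses only the numbers in the table to check that every configuration sums to the same total and to pick the most accurate one. The fixed-budget interpretation is an inference from the data, not a statement from the paper, and the variable names are hypothetical.

```python
from typing import Dict, Tuple

# Repeats per stage -> accuracy in percent, copied from the table.
CONFIGS: Dict[Tuple[int, ...], float] = {
    (1, 2, 3, 4, 5, 9): 88.3,
    (2, 4, 4, 6, 3, 5): 90.1,
    (1, 3, 4, 5, 4, 7): 89.3,
    (1, 2, 2, 3, 5, 11): 87.2,
    (1, 2, 2, 4, 5, 10): 87.5,
    (2, 3, 4, 5, 4, 6): 89.7,
}

# Distinct layer totals across all configurations.
totals = {sum(config) for config in CONFIGS}

# Configuration with the highest reported accuracy.
best_config = max(CONFIGS, key=CONFIGS.get)

print("layer totals:", totals)
print("best configuration:", best_config, CONFIGS[best_config])
```

The best configuration found this way has the same per-stage repeats as stages 1 to 6 of the LKA-based stage table earlier on this page.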

Method	Floating-point operations/GFLOPs	Parameters/10⁶	P	R	F1
Method of Ref. [13]	14.02	33.41	0.846	0.792	0.818
Method of Ref. [14]	12.83	33.88	0.869	0.882	0.872
Method of Ref. [18]	1.32	2.80	0.855	0.868	0.861
Method of Ref. [31]	5.10	20.80	0.866	0.901	0.882
Method of Ref. [6]	2.10	7.70	0.870	0.884	0.876
Proposed method	0.93	4.10	0.915	0.893	0.903
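
The percentages quoted in the abstract can be checked against the table above. The sketch below computes relative changes between the row for reference [18] (Dense-InceptionNet, according to the reference list) and the last row (the proposed method). The function and dictionary names are hypothetical, and treating the abstract's figures as relative changes is an assumption, although it agrees with the table up to rounding.

```python
from typing import Dict, Tuple


def relative_change_percent(baseline: float, value: float) -> float:
    """Signed change of `value` relative to `baseline`, in percent."""
    return (value - baseline) / baseline * 100.0


# label -> (GFLOPs, parameters in millions, P, R, F1), copied from the table
RESULTS: Dict[str, Tuple[float, float, float, float, float]] = {
    "[13]": (14.02, 33.41, 0.846, 0.792, 0.818),
    "[14]": (12.83, 33.88, 0.869, 0.882, 0.872),
    "[18]": (1.32, 2.80, 0.855, 0.868, 0.861),
    "[31]": (5.10, 20.80, 0.866, 0.901, 0.882),
    "[6]": (2.10, 7.70, 0.870, 0.884, 0.876),
    "proposed": (0.93, 4.10, 0.915, 0.893, 0.903),
}

reference = RESULTS["[18]"]
proposed = RESULTS["proposed"]

flops_change = relative_change_percent(reference[0], proposed[0])
params_change = relative_change_percent(reference[1], proposed[1])
f1_change = relative_change_percent(reference[4], proposed[4])

print(f"FLOPs: {flops_change:+.2f}%")
print(f"Parameters: {params_change:+.2f}%")
print(f"F1: {f1_change:+.2f}%")
```

The same calculation shows that the proposed method has more parameters than the method of reference [18] even though it needs fewer floating-point operations, so the claimed saving concerns computation rather than model size.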

Method	Dataset	MICC-F2000	COVERAGE	MICC-F600
Method of Ref. [13]	0.482	0.642	0.574	0.703
Method of Ref. [14]	0.531	0.742	0.626	0.791
Method of Ref. [18]	0.582	0.751	0.631	0.795
Method of Ref. [31]	0.580	0.754	0.615	0.788
Method of Ref. [6]	0.551	0.709	0.602	0.724
Proposed method	0.614	0.771	0.647	0.824

Channel change rate/%	Accuracy/%	Channel change rate/%	Accuracy/%
-30	88.7	0	90.1
-25	90.2	+10	87.8
-10	89.8	+25	88.0
