Image watermarking method combining attention mechanism and multi-scale feature

Journal of Computer Applications, 2025, Vol. 45, Issue 2: 616-623. DOI: 10.11772/j.issn.1001-9081.2024030282

Section: Multimedia computing and computer simulation

Tianqi ZHANG, Shuang TAN, Xiwen SHEN, Juan TANG

School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China

Received: 2024-03-18 Revised: 2024-06-20 Accepted: 2024-06-25 Online: 2024-10-14 Published: 2025-02-10
Corresponding author: Shuang TAN
About the authors: ZHANG Tianqi, born in 1971 in Meishan, Sichuan, Ph.D., professor, CCF member. His research interests include modulation and demodulation of communication signals and blind processing.
SHEN Xiwen, born in 2000 in Chuzhou, Anhui, M.S. candidate. His research interests include speech enhancement and speech signal processing.
TANG Juan, born in 2000 in Deyang, Sichuan, M.S. candidate. Her research interests include satellite spread spectrum signal capture.
Supported by: Natural Science Foundation of Chongqing (cstc2021jcyj-msxmX0836)

Abstract

Existing deep-learning-based watermarking methods neither fully highlight the key features of an image nor make effective use of the features output by intermediate convolutional layers. To improve the visual quality of watermarked images and their resistance to noise attacks, an image watermarking method combining an attention mechanism and multi-scale features is proposed. In the encoder, an attention module is designed to focus on important image features, which reduces the image distortion caused by watermark embedding. In the decoder, a multi-scale feature extraction module is designed to capture image details at different levels. Experimental results on the COCO dataset show that, compared with the deep watermarking model HiDDeN (Hiding Data with Deep Networks), the Peak Signal-to-Noise Ratio (PSNR) and Structural SIMilarity (SSIM) of the watermarked images generated by the proposed method increase by 11.63% and 1.29%, respectively, and the average Bit Error Rate (BER) of watermark extraction under dropout, cropout, crop, Gaussian blur, and JPEG compression decreases by 53.85%. In addition, ablation results confirm that adding the attention module and the multi-scale feature extraction module yields better invisibility and robustness.

Key words: image watermarking, attention mechanism, feature extraction, robust watermarking, deep learning, adversarial training
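The percentages quoted in the abstract are relative changes with respect to HiDDeN. As a minimal sketch, the arithmetic can be reproduced from the HiDDeN and proposed-method rows of Tab. 2 (PSNR, SSIM) and the average BER column of Tab. 3; the helper name `relative_change` and the dictionary layout are illustrative assumptions, not part of the paper.

```python
def relative_change(baseline: float, proposed: float) -> float:
    """Relative change of `proposed` with respect to `baseline`, in percent."""
    return (proposed - baseline) / baseline * 100.0


# Values copied from Tab. 2 (PSNR in dB, SSIM in %) and Tab. 3 (average BER).
hidden = {"psnr": 30.88, "ssim": 96.65, "ber": 0.13}
proposed = {"psnr": 34.47, "ssim": 97.90, "ber": 0.06}

# Positive values are increases, negative values are reductions.
changes = {
    metric: round(relative_change(hidden[metric], proposed[metric]), 2)
    for metric in hidden
}
print(changes)  # {'psnr': 11.63, 'ssim': 1.29, 'ber': -53.85}
```

The negative sign on the BER entry corresponds to the 53.85% reduction stated in the abstract.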

CLC number: TP309.7

Tianqi ZHANG, Shuang TAN, Xiwen SHEN, Juan TANG. Image watermarking method combining attention mechanism and multi-scale feature[J]. Journal of Computer Applications, 2025, 45(2): 616-623.

Figures and tables (14)

Fig. 1 Structure of the proposed model

Fig. 2 Structure of the encoder

Tab. 1 Noise layer types and descriptions

Noise type	Description
Scaling	Resizes $I_{em}$, shrinking or enlarging it by a factor of $r$. If $r < 1$, the image shrinks; otherwise, it is enlarged
dropout	Each pixel of $I_{em}$ is kept with probability $p_d \in (0,1)$ and replaced by the pixel at the same position in $I_{co}$ with probability $1 - p_d$. The larger $p_d$ is, the fewer pixels are lost
Gaussian blur	A Gaussian kernel reweights the neighborhood of each pixel of $I_{em}$ to smooth the image. The attack strength is set by the standard deviation $\sigma$ of the Gaussian kernel; the larger $\sigma$ is, the smoother the processed image
JPEG compression	$I_{em}$ is divided into 8×8 blocks, each block is transformed by the DCT into frequency-domain coefficients, and the coefficients are then quantized. The attack strength is set by the compression quality parameter $q \in (50,100)$; the larger $q$ is, the better image details are preserved and the higher the image quality
Salt-and-pepper noise	With probability $p_s \in (0,1)$, each pixel of $I_{em}$ is randomly replaced by a black or a white pixel; it is kept with probability $1 - 2 p_s$
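Two of the noise layers in Tab. 1 are simple per-pixel random choices and can be sketched directly. The code below is an illustration under stated assumptions, not the authors' implementation: images are plain Python lists of 8-bit grayscale rows, $I_{em}$ is taken to be the watermarked image and $I_{co}$ the cover image, and the function names are hypothetical. For salt-and-pepper noise it assumes that black and white are each chosen with probability $p_s$, which is what makes the keep probability $1 - 2 p_s$ and requires $p_s < 0.5$.

```python
import random
from typing import List

Image = List[List[int]]  # grayscale image as rows of 0-255 pixel values


def dropout(i_em: Image, i_co: Image, p_d: float, rng: random.Random) -> Image:
    """Keep each watermarked pixel with probability p_d; otherwise take the
    pixel at the same position from the cover image."""
    return [
        [em if rng.random() < p_d else co for em, co in zip(row_em, row_co)]
        for row_em, row_co in zip(i_em, i_co)
    ]


def salt_and_pepper(i_em: Image, p_s: float, rng: random.Random) -> Image:
    """Set each pixel to black with probability p_s, to white with probability
    p_s, and keep it with probability 1 - 2 * p_s (assumption: p_s < 0.5)."""
    if not 0.0 < p_s < 0.5:
        raise ValueError("p_s must lie in (0, 0.5) so that 1 - 2*p_s is a probability")
    noisy = []
    for row in i_em:
        new_row = []
        for pixel in row:
            u = rng.random()
            if u < p_s:
                new_row.append(0)      # pepper (black)
            elif u < 2 * p_s:
                new_row.append(255)    # salt (white)
            else:
                new_row.append(pixel)  # unchanged
        noisy.append(new_row)
    return noisy


rng = random.Random(0)  # fixed seed so the sketch is repeatable
cover = [[10] * 64 for _ in range(64)]
marked = [[200] * 64 for _ in range(64)]

dropped = dropout(marked, cover, p_d=0.3, rng=rng)
kept_fraction = sum(row.count(200) for row in dropped) / (64 * 64)

noisy = salt_and_pepper(marked, p_s=0.1, rng=rng)
unchanged_fraction = sum(row.count(200) for row in noisy) / (64 * 64)
```

With $p_d = 0.3$ roughly 30% of the watermarked pixels survive dropout, and with $p_s = 0.1$ roughly 80% of the pixels are left untouched by salt-and-pepper noise. In a training pipeline these operations would act on tensors and need to be differentiable or bypassed in the backward pass; that part is outside this sketch.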

Fig. 3 Effects of the noise layers

Fig. 4 Structure of the decoder

Fig. 5 Structure of the discriminator

Fig. 6 Subjective invisibility of watermarked images produced by different methods

Fig. 7 Heat maps before and after introducing the attention module

Tab. 2 PSNR and SSIM of watermarked images generated by different methods

Method	PSNR/dB	SSIM/%
HiDDeN-NN	35.61	98.63
Proposed method-NN	41.09	99.65
HiDDeN	30.88	96.65
Proposed method	34.47	97.90
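For readers unfamiliar with the PSNR column, the standard definition is $10 \log_{10}(\mathrm{MAX}^2 / \mathrm{MSE})$, where MSE is the mean squared error between the cover and watermarked images and MAX is the largest pixel value. The sketch below applies that definition to a toy 2×2 grayscale image; the function name `psnr` and the list-of-rows image layout are illustrative assumptions, and the paper's own evaluation code may differ (for example in color handling or value range).

```python
import math


def psnr(original, distorted, max_value=255.0):
    """PSNR in dB between two equally sized grayscale images (lists of rows)."""
    squared_errors = [
        (a - b) ** 2
        for row_a, row_b in zip(original, distorted)
        for a, b in zip(row_a, row_b)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images have unbounded PSNR
    return 10.0 * math.log10(max_value ** 2 / mse)


cover = [[100, 120], [140, 160]]
marked = [[101, 119], [142, 160]]  # small embedding distortion, MSE = 1.5

value = psnr(cover, marked)
print(round(value, 2))  # about 46.37 dB
```

A higher PSNR means the watermarked image is closer to the cover image, which is why it serves as an invisibility measure in Tab. 2.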

Fig. 8 BERs of different methods under different attack strengths

Tab. 3 Performance comparison of different methods on the COCO dataset

PSNR and SSIM measure invisibility; the BER columns measure robustness under each noise attack.

Method	PSNR/dB	SSIM/%	BER, dropout ($p_d = 0.3$)	BER, cropout ($p = 0.3$)	BER, crop ($p = 0.035$)	BER, Gaussian blur ($\sigma = 2$)	BER, JPEG compression ($q = 80$)	BER, average	Parameters/10⁶
HiDDeN	30.88	96.65	0.07	0.06	0.12	0.04	0.37	0.13	0.45
ReDMark	35.93	96.60	0.08	0.08	0.12	0.50	0.25	0.21	0.13
IGA	-	-	0.22	0.13	0.26	0.19	0.13	0.19	-
SSLW	33.50	84.12	0.12	0.49	0.20	0.01	0.17	0.20	27.70
ARWGAN	35.87	96.88	0.04	0.04	0.04	0.03	0.14	0.06	1.50
Proposed method	35.92	98.14	0.04	0.02	0.02	0.03	0.17	0.06	0.55
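The BER columns of Tab. 3 report the fraction of watermark bits recovered incorrectly, and the average column is the mean over the five attacks. As a hedged sketch, the snippet below shows the usual BER definition on a toy bit string and recomputes the average column from the per-attack values; `bit_error_rate` and the `table3` dictionary are illustrative names, not code from the paper.

```python
def bit_error_rate(embedded_bits, extracted_bits):
    """Fraction of watermark bits that differ after extraction."""
    if len(embedded_bits) != len(extracted_bits):
        raise ValueError("bit strings must have the same length")
    errors = sum(a != b for a, b in zip(embedded_bits, extracted_bits))
    return errors / len(embedded_bits)


# Toy example: 2 of 4 bits are flipped by the attack.
toy_ber = bit_error_rate([1, 0, 1, 1], [1, 1, 1, 0])

# Per-attack BERs from Tab. 3, in column order:
# dropout, cropout, crop, Gaussian blur, JPEG compression.
table3 = {
    "HiDDeN": [0.07, 0.06, 0.12, 0.04, 0.37],
    "ReDMark": [0.08, 0.08, 0.12, 0.50, 0.25],
    "IGA": [0.22, 0.13, 0.26, 0.19, 0.13],
    "SSLW": [0.12, 0.49, 0.20, 0.01, 0.17],
    "ARWGAN": [0.04, 0.04, 0.04, 0.03, 0.14],
    "Proposed method": [0.04, 0.02, 0.02, 0.03, 0.17],
}

# Mean over the five attacks, rounded to two decimals as in the table.
averages = {name: round(sum(bers) / len(bers), 2) for name, bers in table3.items()}
print(averages)
```

The recomputed means agree with the tabulated average column for every method. The rounded averages 0.13 (HiDDeN) and 0.06 (proposed method) give the 53.85% reduction quoted in the abstract; using the means of the tabulated per-attack values before rounding (0.132 and 0.056) would give roughly 57.6%.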

Fig. 9 Images from different datasets and their watermarked versions

Tab. 4 Results of the proposed method on different datasets

PSNR and SSIM measure invisibility; the BER columns measure robustness under each noise attack.

Dataset	PSNR/dB	SSIM/%	BER, scaling ($r = 0.8$)	BER, dropout ($p_d = 0.3$)	BER, Gaussian blur ($\sigma = 2$)	BER, JPEG compression ($q = 80$)	BER, salt-and-pepper noise ($p_s = 0.1$)	BER, average
COCO	34.47	97.90	0.01	0.07	0.03	0.04	0.07	0.04
ImageNet	34.88	97.75	0.02	0.08	0.03	0.05	0.07	0.05
VOC 2012	35.10	97.83	0.02	0.08	0.03	0.05	0.07	0.05
NaSC-TG2	37.21	99.52	0.03	0.08	0.03	0.06	0.07	0.05
Animal	35.74	97.89	0.02	0.07	0.02	0.06	0.07	0.05
Intel	34.56	98.24	0.03	0.09	0.03	0.03	0.07	0.05

Tab. 5 Comparison of ablation experimental results

PSNR and SSIM measure invisibility; the BER columns measure robustness under each noise attack.

Method	PSNR/dB	SSIM/%	BER, scaling ($r = 0.8$)	BER, dropout ($p_d = 0.3$)	BER, Gaussian blur ($\sigma = 2$)	BER, JPEG compression ($q = 80$)	BER, salt-and-pepper noise ($p_s = 0.1$)	BER, average
w/o am	33.42	97.77	0.19	0.07	0.07	0.09	0.05	0.09
w/o mf	30.79	96.76	0.08	0.10	0.14	0.09	0.14	0.11
Proposed method	34.47	97.90	0.01	0.07	0.03	0.04	0.07	0.04

References (29)

1	KASHYAP N, SINHA G R. Image watermarking using 3-level Discrete Wavelet Transform (DWT)[J]. International Journal of Modern Education and Computer Science, 2012, 4(3): 50-56.
2	ABRAHAM J, PAUL V. An imperceptible spatial domain color image watermarking scheme[J]. Journal of King Saud University - Computer and Information Sciences, 2019, 31(1): 125-133.
3	ALI M, AHN C W, PANT M. A robust image watermarking technique using SVD and differential evolution in DCT domain[J]. Optik, 2014, 125(1): 428-434.
4	LU S P, WANG R, ZHONG T, et al. Large-capacity image steganography based on invertible neural networks[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 10811-10820.
5	LU J, NI J, SU W, et al. Wavelet-based CNN for robust and high-capacity image watermarking[C]// Proceedings of the 2022 IEEE International Conference on Multimedia and Expo. Piscataway: IEEE, 2022: 1-6.
6	PLATA M, SYGA P. Robust spatial-spread deep neural image watermarking[C]// Proceedings of the IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications. Piscataway: IEEE, 2020: 62-70.
7	TEOH Y J, LING H C, WONG W K, et al. A hybrid SVD-based image watermarking scheme utilizing both U and V orthogonal vectors for robustness and imperceptibility[J]. IEEE Access, 2023, 11: 51018-51031.
8	ZHONG R Z, XIE H B. Robust image watermarking algorithm based on visual saliency and quantization exponential modulation[J]. Journal of Electronic Measurement and Instrumentation, 2020, 34(3): 17-27. (in Chinese)
9	YUAN Z, LIU D, ZHANG X, et al. DCT-based color digital image blind watermarking method with variable steps[J]. Multimedia Tools and Applications, 2020, 79(41/42): 30557-30581.
10	ZHANG T Q, ZHOU L, LIANG X M, et al. A robust watermarking algorithm based on Blob-Harris and NSCT-Zernike[J]. Journal of Electronics and Information Technology, 2021, 43(7): 2038-2045. (in Chinese)
11	FANG H, JIA Z, MA Z, et al. PIMoG: an effective screen-shooting noise-layer simulation for deep-learning-based watermarking network[C]// Proceedings of the 30th ACM International Conference on Multimedia. New York: ACM, 2022: 2267-2275.
12	MAHAPATRA D, AMRIT P, SINGH O P, et al. Autoencoder-convolutional neural network-based embedding and extraction model for image watermarking[J]. Journal of Electronic Imaging, 2023, 32(2): No. 021604.
13	WANG X, MA D, HU K, et al. Mapping based residual convolution neural network for non-embedding and blind image watermarking[J]. Journal of Information Security and Applications, 2021, 59: No. 102820.
14	BALUJA S. Hiding images in plain sight: deep steganography[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2017: 2066-2076.
15	ZHU J, KAPLAN R, JOHNSON J, et al. HiDDeN: hiding data with deep networks[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11219. Cham: Springer, 2018: 682-697.
16	HAO K, FENG G, ZHANG X. Robust image watermarking based on generative adversarial network[J]. China Communications, 2020, 17(11): 131-140.
17	ZHAO Z, LI J, LUO Z, et al. Remote sensing image scene classification based on an enhanced attention module[J]. IEEE Geoscience and Remote Sensing Letters, 2021, 18(11): 1926-1930.
18	FU J, LIU J, JIANG J, et al. Scene segmentation with dual relation-aware attention network[J]. IEEE Transactions on Neural Networks and Learning Systems, 2021, 32(6): 2547-2560.
19	YAN C, HAO Y, LI L, et al. Task-adaptive attention for image captioning[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2022, 32(1): 43-51.
20	CHEN B, TAN W, COATRIEUX G, et al. A serial image copy-move forgery localization scheme with source/target distinguishment[J]. IEEE Transactions on Multimedia, 2021, 23: 3506-3517.
21	ZHANG H, LI Y. Digital watermarking via inverse gradient attention[C]// Proceedings of the 9th International Conference on Behavioural and Social Computing. Piscataway: IEEE, 2022: 1-3.
22	YU C. Attention based data hiding with generative adversarial networks[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2020: 1120-1128.
23	LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context[C]// Proceedings of the 2014 European Conference on Computer Vision, LNCS 8693. Cham: Springer, 2014: 740-755.
24	AHMADI M, NOROUZI A, KARIMI N, et al. ReDMark: framework for residual diffusion watermarking based on deep networks[J]. Expert Systems with Applications, 2020, 146: No. 113157.
25	FERNANDEZ P, SABLAYROLLES A, FURON T, et al. Watermarking images in self-supervised latent spaces[C]// Proceedings of the 2022 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2022: 3054-3058.
26	HUANG J, LUO T, LI L, et al. ARWGAN: attention-guided robust image watermarking model based on GAN[J]. IEEE Transactions on Instrumentation and Measurement, 2023, 72: No. 5018417.
27	DENG J, DONG W, SOCHER R, et al. ImageNet: a large-scale hierarchical image database[C]// Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2009: 248-255.
28	EVERINGHAM M, VAN GOOL L, WILLIAMS C K I, et al. The PASCAL Visual Object Classes (VOC) challenge[J]. International Journal of Computer Vision, 2010, 88(2): 303-338.
29	ZHOU Z, LI S, WU W, et al. NaSC-TG2: natural scene classification with Tiangong-2 remotely sensed imagery[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2021, 14: 3228-3242.
