Multi-focus image fusion network with cascade fusion and enhanced reconstruction

doi:10.11772/j.issn.1001-9081.2024030302

Journal of Computer Applications, 2025, Vol. 45, Issue (2): 594-600. DOI: 10.11772/j.issn.1001-9081.2024030302

• Multimedia computing and computer simulation •

Benchen YANG, Haoran LI, Haibo JIN

School of Software, Liaoning Technical University, Huludao, Liaoning 125105, China

Received: 2024-03-21 Revised: 2024-06-10 Accepted: 2024-06-13 Online: 2024-07-31 Published: 2025-02-10
Contact: Benchen YANG
About the authors: LI Haoran, born in 2000 in Tieling, Liaoning, M.S. candidate, CCF member. His research interests include multi-focus image fusion, image super-resolution reconstruction, and image enhancement.
JIN Haibo, born in 1983 in Shenyang, Liaoning, Ph.D., associate professor, CCF member. His research interests include deep learning, computer vision, and reliability analysis of complex systems.
Supported by:
National Natural Science Foundation of China (62173171)

Abstract:

To address the partially focused images that result from improper focusing on near and far fields of view during digital image capture, a multi-focus image fusion Network with Cascade fusion and enhanced reconstruction (CasNet) was proposed. Firstly, a cascade sampling module was constructed to compute and merge the residuals of feature maps sampled at different depths, so that focused features at different scales were used efficiently. Secondly, a lightweight multi-head self-attention mechanism was improved to compute the dimensional residuals of the feature maps, thereby enhancing the image features and giving the feature maps a better distribution across dimensions. Thirdly, stacked convolutional channel attention was used to complete feature reconstruction. Finally, interval convolution was used for up- and down-sampling during the sampling process, so as to retain more of the original image features. Experimental results show that, on the multi-focus image benchmark test sets Lytro, MFFW, grayscale, and MFI-WHU, CasNet achieves better results than popular methods such as SESF-Fuse (Spatially Enhanced Spatial Frequency-based Fusion) and U2Fusion (Unified Unsupervised Fusion network) on metrics such as Average Gradient (AG) and Gray-Level Difference (GLD).

Key words: multi-focus image fusion, Deep Neural Network (DNN), feature reconstruction, feature enhancement, attention
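
The abstract reports results in terms of Average Gradient (AG) without giving its formula on this page. As a minimal sketch, assuming the commonly used definition (the mean over all pixels of the root of the averaged squared horizontal and vertical forward differences), AG could be computed as below. The function name `average_gradient` and the list-of-rows image format are illustrative assumptions, not taken from the paper, and the paper's exact normalization may differ.

```python
import math


def average_gradient(img):
    """Average gradient of a 2-D grayscale image given as a list of rows.

    Assumed definition (not confirmed by the paper):
    AG = mean over (i, j) of sqrt((dx^2 + dy^2) / 2),
    with dx, dy the horizontal and vertical forward differences.
    """
    rows, cols = len(img), len(img[0])
    if rows < 2 or cols < 2:
        raise ValueError("image must be at least 2x2")
    total = 0.0
    for i in range(rows - 1):
        for j in range(cols - 1):
            dx = img[i][j + 1] - img[i][j]  # horizontal forward difference
            dy = img[i + 1][j] - img[i][j]  # vertical forward difference
            total += math.sqrt((dx * dx + dy * dy) / 2.0)
    return total / ((rows - 1) * (cols - 1))


# A constant image has no gradient; a horizontal ramp has a uniform one.
flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
ramp = [[0, 1, 2], [0, 1, 2], [0, 1, 2]]
print(average_gradient(flat))
print(average_gradient(ramp))
```

Under this definition a sharper (better focused) fused image tends to score higher, which is consistent with AG being reported as a metric where larger values are better.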

CLC Number: TP391.4

Benchen YANG, Haoran LI, Haibo JIN. Multi-focus image fusion network with cascade fusion and enhanced reconstruction[J]. Journal of Computer Applications, 2025, 45(2): 594-600.

Figures/Tables: 10

Dataset	Method	AG	GLD	MSD	LIF
Lytro	DWT	2.892 015	14.317 518	0.110 454	0.410 464
	GF	2.886 802	14.287 715	0.110 801	0.407 880
	DSIFT	2.890 632	14.306 651	0.110 830	0.407 591
	NSCT	2.876 968	14.244 632	0.110 496	0.410 090
	IFCNN	2.891 815	14.316 606	0.110 749	0.407 939
	SESF-Fuse	2.888 225	14.294 464	0.110 864	0.407 565
	U2Fusion	2.286 115	11.307 678	0.099 932	0.427 310
	MFF-GAN	2.726 301	13.505 882	0.106 499	0.411 883
	CasNet	3.008 855	14.905 202	0.111 582	0.406 753
MFFW	DWT	3.497 837	17.372 400	0.119 985	0.414 033
	GF	3.446 860	17.131 368	0.118 574	0.413 859
	DSIFT	3.458 350	17.185 319	0.118 444	0.412 197
	NSCT	3.429 046	17.044 636	0.120 315	0.414 269
	IFCNN	3.421 395	17.005 448	0.121 998	0.410 692
	SESF-Fuse	3.508 006	17.433 581	0.119 363	0.407 991
	U2Fusion	2.633 200	13.077 647	0.108 532	0.431 687
	MFF-GAN	3.365 119	16.768 212	0.115 540	0.404 708
	CasNet	3.622 536	18.007 483	0.123 359	0.415 832
grayscale	DWT	3.920 240	19.336 019	0.154 317	0.457 913
	GF	3.828 959	18.869 385	0.153 370	0.457 169
	DSIFT	3.848 095	18.965 958	0.153 655	0.456 716
	NSCT	3.834 118	18.900 831	0.154 379	0.457 600
	IFCNN	3.869 484	19.087 379	0.154 687	0.456 762
	SESF-Fuse	3.830 764	18.880 392	0.153 642	0.456 556
	U2Fusion	2.951 674	14.522 842	0.140 666	0.472 399
	MFF-GAN	3.870 514	19.118 118	0.149 254	0.459 194
	CasNet	4.137 110	20.435 776	0.158 180	0.451 321
MFI-WHU	DWT	4.147 645	20.625 134	0.100 404	0.482 206
	GF	4.131 233	20.532 169	0.100 373	0.482 510
	DSIFT	4.099 983	20.403 363	0.100 261	0.483 256
	NSCT	4.141 690	20.589 193	0.100 430	0.482 193
	IFCNN	4.110 629	20.464 973	0.100 824	0.480 513
	SESF-Fuse	4.098 331	20.392 291	0.100 376	0.482 165
	U2Fusion	2.906 110	14.379 773	0.091 892	0.500 723
	MFF-GAN	3.883 448	19.301 371	0.098 425	0.479 748
	CasNet	4.422 203	21.971 969	0.101 070	0.480 715
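
As a reading aid for the comparison table above, the following sketch computes, for each dataset, which baseline has the highest AG and by what relative margin CasNet exceeds it. The AG values are copied from the table; the helper name `margin_over_best_baseline` is hypothetical, and treating a higher AG as better is an assumption consistent with the abstract's claim.

```python
# AG values of the baseline methods, copied from the comparison table.
baseline_ag = {
    "Lytro": {
        "DWT": 2.892015, "GF": 2.886802, "DSIFT": 2.890632, "NSCT": 2.876968,
        "IFCNN": 2.891815, "SESF-Fuse": 2.888225, "U2Fusion": 2.286115,
        "MFF-GAN": 2.726301,
    },
    "MFFW": {
        "DWT": 3.497837, "GF": 3.446860, "DSIFT": 3.458350, "NSCT": 3.429046,
        "IFCNN": 3.421395, "SESF-Fuse": 3.508006, "U2Fusion": 2.633200,
        "MFF-GAN": 3.365119,
    },
    "grayscale": {
        "DWT": 3.920240, "GF": 3.828959, "DSIFT": 3.848095, "NSCT": 3.834118,
        "IFCNN": 3.869484, "SESF-Fuse": 3.830764, "U2Fusion": 2.951674,
        "MFF-GAN": 3.870514,
    },
    "MFI-WHU": {
        "DWT": 4.147645, "GF": 4.131233, "DSIFT": 4.099983, "NSCT": 4.141690,
        "IFCNN": 4.110629, "SESF-Fuse": 4.098331, "U2Fusion": 2.906110,
        "MFF-GAN": 3.883448,
    },
}

# AG values of CasNet, copied from the same table.
casnet_ag = {
    "Lytro": 3.008855, "MFFW": 3.622536,
    "grayscale": 4.137110, "MFI-WHU": 4.422203,
}


def margin_over_best_baseline(baselines, proposed):
    """Return (best baseline name, relative margin of `proposed` in percent)."""
    best_name = max(baselines, key=baselines.get)
    best = baselines[best_name]
    return best_name, 100.0 * (proposed - best) / best


results = {
    dataset: margin_over_best_baseline(scores, casnet_ag[dataset])
    for dataset, scores in baseline_ag.items()
}
for dataset, (name, pct) in results.items():
    print(f"{dataset}: best baseline {name}, CasNet AG margin {pct:.2f}%")
```

On these numbers the strongest AG baseline is DWT on three datasets and SESF-Fuse on MFFW, and CasNet's margin is a few percent in each case. The same helper can be applied to the GLD column.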

No.	Sampling method	CSL	FEL	FRL	AG	GLD	MSD	LIF
1	Interval convolution	√	√	√	3.008 855	14.905 202	0.111 582	0.406 753
2	Convolution + pooling	√	√	√	2.836 003	14.032 692	0.112 648	0.408 859
3	Interval convolution	√	×	√	2.881 604	14.299 073	0.110 952	0.407 592
4	Interval convolution	√	√	×	2.885 072	14.279 921	0.111 756	0.409 389
5	Interval convolution	√	×	×	2.757 684	13.642 471	0.109 382	0.413 323
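
To make the ablation rows above easier to compare, this short sketch expresses each variant's AG as a relative drop from the full configuration (row 1). The values are copied from the table. The column abbreviations CSL, FEL and FRL are not expanded on this page, so the boolean fields below simply mirror the √/× marks without assuming what each layer does.

```python
# (row id, sampling method, FEL enabled, FRL enabled, AG), copied from the
# ablation table; CSL is enabled in every row and is therefore omitted.
rows = [
    (1, "interval convolution", True, True, 3.008855),
    (2, "convolution + pooling", True, True, 2.836003),
    (3, "interval convolution", False, True, 2.881604),
    (4, "interval convolution", True, False, 2.885072),
    (5, "interval convolution", False, False, 2.757684),
]

full_ag = rows[0][-1]  # AG of the full configuration (row 1)

# Relative AG drop, in percent, of each ablated variant versus row 1.
drops = {
    rid: 100.0 * (full_ag - ag) / full_ag
    for rid, _sampling, _fel, _frl, ag in rows[1:]
}

for rid, pct in drops.items():
    print(f"row {rid}: AG is {pct:.2f}% lower than the full configuration")
```

Read this way, removing both FEL and FRL (row 5) costs the most AG, and replacing interval convolution with convolution plus pooling (row 2) costs more than removing either FEL or FRL alone.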

Dimension	AG	GLD	MSD	LIF
32	2.961 190	14.348 444	0.111 700	0.408 259
64	3.008 855	14.905 202	0.111 582	0.406 753
128	3.033 247	15.032 255	0.111 764	0.406 228
