Multi-focus image fusion network with cascade fusion and enhanced reconstruction

doi:10.11772/j.issn.1001-9081.2024030302

Abstract:

To address the problem of partially focused images caused by improper focusing on the near and far fields of view during digital image capture, a multi-focus image fusion Network with Cascade fusion and enhanced reconstruction (CasNet) was proposed. Firstly, a cascade sampling module was constructed to compute and merge the residuals of feature maps sampled at different depths, so that focused features at different scales are used efficiently. Secondly, a lightweight multi-head self-attention mechanism was improved to compute dimensional residuals of the feature maps, which enhances the image features and gives the feature maps a better distribution across dimensions. Thirdly, stacked convolutional channel attention was used to complete feature reconstruction. Finally, interval convolution was used for up- and down-sampling during the sampling process, so as to retain more of the original image features. Experimental results show that, on the multi-focus image benchmark test sets Lytro, MFFW, grayscale, and MFI-WHU, CasNet achieves better results on metrics such as Average Gradient (AG) and Gray-Level Difference (GLD) than popular methods such as SESF-Fuse (Spatially Enhanced Spatial Frequency-based Fusion) and U2Fusion (Unified Unsupervised Fusion network).

Key words: multi-focus image fusion, Deep Neural Network (DNN), feature reconstruction, feature enhancement, attention

CLC Number: TP391.4

Benchen YANG, Haoran LI, Haibo JIN. Multi-focus image fusion network with cascade fusion and enhanced reconstruction[J]. Journal of Computer Applications, 2025, 45(2): 594-600.

Dataset	Method	AG	GLD	MSD	LIF
Lytro	DWT	2.892 015	14.317 518	0.110 454	0.410 464
	GF	2.886 802	14.287 715	0.110 801	0.407 880
	DSIFT	2.890 632	14.306 651	0.110 830	0.407 591
	NSCT	2.876 968	14.244 632	0.110 496	0.410 090
	IFCNN	2.891 815	14.316 606	0.110 749	0.407 939
	SESF-Fuse	2.888 225	14.294 464	0.110 864	0.407 565
	U2Fusion	2.286 115	11.307 678	0.099 932	0.427 310
	MFF-GAN	2.726 301	13.505 882	0.106 499	0.411 883
	CasNet	3.008 855	14.905 202	0.111 582	0.406 753
MFFW	DWT	3.497 837	17.372 400	0.119 985	0.414 033
	GF	3.446 860	17.131 368	0.118 574	0.413 859
	DSIFT	3.458 350	17.185 319	0.118 444	0.412 197
	NSCT	3.429 046	17.044 636	0.120 315	0.414 269
	IFCNN	3.421 395	17.005 448	0.121 998	0.410 692
	SESF-Fuse	3.508 006	17.433 581	0.119 363	0.407 991
	U2Fusion	2.633 200	13.077 647	0.108 532	0.431 687
	MFF-GAN	3.365 119	16.768 212	0.115 540	0.404 708
	CasNet	3.622 536	18.007 483	0.123 359	0.415 832
grayscale	DWT	3.920 240	19.336 019	0.154 317	0.457 913
	GF	3.828 959	18.869 385	0.153 370	0.457 169
	DSIFT	3.848 095	18.965 958	0.153 655	0.456 716
	NSCT	3.834 118	18.900 831	0.154 379	0.457 600
	IFCNN	3.869 484	19.087 379	0.154 687	0.456 762
	SESF-Fuse	3.830 764	18.880 392	0.153 642	0.456 556
	U2Fusion	2.951 674	14.522 842	0.140 666	0.472 399
	MFF-GAN	3.870 514	19.118 118	0.149 254	0.459 194
	CasNet	4.137 110	20.435 776	0.158 180	0.451 321
MFI-WHU	DWT	4.147 645	20.625 134	0.100 404	0.482 206
	GF	4.131 233	20.532 169	0.100 373	0.482 510
	DSIFT	4.099 983	20.403 363	0.100 261	0.483 256
	NSCT	4.141 690	20.589 193	0.100 430	0.482 193
	IFCNN	4.110 629	20.464 973	0.100 824	0.480 513
	SESF-Fuse	4.098 331	20.392 291	0.100 376	0.482 165
	U2Fusion	2.906 110	14.379 773	0.091 892	0.500 723
	MFF-GAN	3.883 448	19.301 371	0.098 425	0.479 748
	CasNet	4.422 203	21.971 969	0.101 070	0.480 715
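
To make the size of the AG differences in the comparison table above easier to read, the following minimal sketch computes CasNet's relative AG gain over the strongest competing method on each dataset. The AG values are copied from the table with the digit-group spaces removed. The dictionary `AG` and the helper `gain_over_best_baseline` are hypothetical names introduced here, and the sketch assumes that a higher AG is better, as is conventional for this metric.

```python
# AG values copied from the comparison table (digit-group spaces removed).
AG = {
    "Lytro": {
        "DWT": 2.892015, "GF": 2.886802, "DSIFT": 2.890632, "NSCT": 2.876968,
        "IFCNN": 2.891815, "SESF-Fuse": 2.888225, "U2Fusion": 2.286115,
        "MFF-GAN": 2.726301, "CasNet": 3.008855,
    },
    "MFFW": {
        "DWT": 3.497837, "GF": 3.446860, "DSIFT": 3.458350, "NSCT": 3.429046,
        "IFCNN": 3.421395, "SESF-Fuse": 3.508006, "U2Fusion": 2.633200,
        "MFF-GAN": 3.365119, "CasNet": 3.622536,
    },
    "grayscale": {
        "DWT": 3.920240, "GF": 3.828959, "DSIFT": 3.848095, "NSCT": 3.834118,
        "IFCNN": 3.869484, "SESF-Fuse": 3.830764, "U2Fusion": 2.951674,
        "MFF-GAN": 3.870514, "CasNet": 4.137110,
    },
    "MFI-WHU": {
        "DWT": 4.147645, "GF": 4.131233, "DSIFT": 4.099983, "NSCT": 4.141690,
        "IFCNN": 4.110629, "SESF-Fuse": 4.098331, "U2Fusion": 2.906110,
        "MFF-GAN": 3.883448, "CasNet": 4.422203,
    },
}


def gain_over_best_baseline(scores, ours="CasNet"):
    """Return (best baseline name, relative gain of `ours` in percent).

    Assumes higher scores are better (hypothetical helper, not from the paper).
    """
    baselines = {name: value for name, value in scores.items() if name != ours}
    best = max(baselines, key=baselines.get)
    gain_pct = 100.0 * (scores[ours] - baselines[best]) / baselines[best]
    return best, gain_pct


for dataset, scores in AG.items():
    best, pct = gain_over_best_baseline(scores)
    print(f"{dataset}: best baseline {best}, CasNet AG gain {pct:.2f}%")
```

Under these assumptions the AG gain over the best baseline comes out to roughly 3% to 7%, depending on the dataset. The same helper can be applied to the GLD or MSD columns; for LIF the direction of "better" is not stated here, so the comparison would need to be adapted.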

No.	Sampling method	CSL	FEL	FRL	AG	GLD	MSD	LIF
1	Interval convolution	√	√	√	3.008 855	14.905 202	0.111 582	0.406 753
2	Convolution and pooling	√	√	√	2.836 003	14.032 692	0.112 648	0.408 859
3	Interval convolution	√	×	√	2.881 604	14.299 073	0.110 952	0.407 592
4	Interval convolution	√	√	×	2.885 072	14.279 921	0.111 756	0.409 389
5	Interval convolution	√	×	×	2.757 684	13.642 471	0.109 382	0.413 323

Dimension	AG	GLD	MSD	LIF
32	2.961 190	14.348 444	0.111 700	0.408 259
64	3.008 855	14.905 202	0.111 582	0.406 753
128	3.033 247	15.032 255	0.111 764	0.406 228
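
The AG columns in the tables above report Average Gradient, a sharpness measure. The exact formula used for these results is not given here, so the following is only a sketch of one widely used definition: the mean, over all pixels, of the root mean square of the horizontal and vertical forward differences. The function name `average_gradient` and the list-of-rows image representation are assumptions made for illustration.

```python
import math


def average_gradient(img):
    """Average Gradient of a 2-D grayscale image given as a list of rows.

    Sketch of a common definition (an assumption, not confirmed as the
    paper's formula): mean of sqrt((dx^2 + dy^2) / 2) over forward
    differences, evaluated on the (rows-1) x (cols-1) interior grid.
    """
    rows, cols = len(img), len(img[0])
    if rows < 2 or cols < 2:
        raise ValueError("image must be at least 2x2")
    total = 0.0
    for i in range(rows - 1):
        for j in range(cols - 1):
            dx = img[i][j + 1] - img[i][j]  # horizontal forward difference
            dy = img[i + 1][j] - img[i][j]  # vertical forward difference
            total += math.sqrt((dx * dx + dy * dy) / 2.0)
    return total / ((rows - 1) * (cols - 1))


# A constant image has no gradient; a sharper edge scores higher than a soft one.
flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
soft_edge = [[0, 1, 2], [0, 1, 2], [0, 1, 2]]
hard_edge = [[0, 0, 4], [0, 0, 4], [0, 0, 4]]
print(average_gradient(flat))
print(average_gradient(soft_edge))
print(average_gradient(hard_edge))
```

Under this definition a better-focused fused image, which preserves more edges and texture, yields a larger AG, which is consistent with reading higher AG values in the tables as better.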