Improved U-Net algorithm based on attention mechanism and multi-scale fusion

doi:10.11772/j.issn.1001-9081.2022121844

Journal of Computer Applications: 24-28. DOI: 10.11772/j.issn.1001-9081.2022121844

Special Issue: Data Science and Technology

• Artificial intelligence •

Song WU^1,2, Xin LAN^1,2, Jingyang SHAN^1,2, Haiwen XU^3

^1. Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu Sichuan 610213, China
^2. School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049, China
^3. Faculty of Science, Civil Aviation Flight University of China, Guanghan Sichuan 618307, China

Received: 2024-01-10; Revised: 2024-02-08; Accepted: 2024-02-29; Online: 2023-03-13; Published: 2024-12-31
Contact: Haiwen XU

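About the authors:
Song WU (born 1995), male, from Lichuan, Hubei; M.S. candidate; CCF member. Research interests: deep learning, image segmentation.
Xin LAN (born 1998), female, from Longyan, Fujian; Ph.D. candidate; CCF member. Research interests: deep learning, object detection.
Jingyang SHAN (born 1997), male, from Chengdu, Sichuan; Ph.D. candidate; CCF member. Research interests: deep learning, few-shot learning.
Haiwen XU (born 1978), male, from Heze, Shandong; professor, Ph.D. Research interests: big data models and algorithms, optimization theory and algorithms.
Supported by: Science and Technology Cooperation Fund of Chengdu and the Chinese Academy of Sciences; Project of the Key Laboratory of Civil Aviation Flight Technology and Flight Safety (FZ2022ZZ05)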

Abstract

To address the computational redundancy of the original U-Net and its difficulty in segmenting fine structures in medical image segmentation tasks, an improved U-Net algorithm based on attention mechanism and multi-scale fusion was proposed. Firstly, a channel attention mechanism was integrated into the skip connections so that the network focused on the channels containing more important information, thereby reducing computational resource cost and improving computational efficiency. Secondly, a feature fusion strategy was added to enrich the contextual information of the feature maps passed to the decoder, which realized complementary and repeated use of the features. Finally, Dice loss and binary cross-entropy loss were optimized jointly to handle the dramatic oscillations of the loss function that may occur in fine structure segmentation. Experimental results on the Kvasir_seg and DRIVE datasets show that, compared with the original U-Net algorithm, the proposed algorithm increases the Dice coefficient by 1.81 and 0.82 percentage points, the sensitivity (SE) by 1.94 and 3.53 percentage points, and the accuracy (Acc) by 1.62 and 0.04 percentage points, respectively. The proposed algorithm can therefore enhance the performance of the original U-Net in fine structure segmentation.
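The joint optimization of Dice loss and binary cross-entropy loss described in the abstract can be illustrated with a minimal sketch. The abstract does not give the weighting between the two terms or any smoothing constant, so the `dice_weight`, `smooth`, and `eps` parameters below are assumptions for illustration only, as are the function names. Inputs are flattened lists of predicted foreground probabilities and binary ground-truth labels.

```python
import math


def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss: 1 - (2 * overlap + smooth) / (sum(probs) + sum(targets) + smooth).

    `smooth` is an assumed stabilizer that avoids division by zero on empty masks.
    """
    overlap = sum(p * t for p, t in zip(probs, targets))
    total = sum(probs) + sum(targets)
    return 1.0 - (2.0 * overlap + smooth) / (total + smooth)


def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over all pixels, with probabilities clipped to (eps, 1 - eps)."""
    clipped = [min(max(p, eps), 1.0 - eps) for p in probs]
    per_pixel = [
        -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
        for p, t in zip(clipped, targets)
    ]
    return sum(per_pixel) / len(per_pixel)


def joint_loss(probs, targets, dice_weight=0.5):
    """Weighted sum of Dice loss and BCE loss; the 0.5 default is an assumption."""
    return dice_weight * dice_loss(probs, targets) + (1.0 - dice_weight) * bce_loss(probs, targets)
```

The Dice term is computed over the whole mask, while the cross-entropy term averages a per-pixel penalty; combining them is a common way to keep the region-level objective while retaining a per-pixel signal. The exact combination used by the authors is not stated on this page.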

Key words: deep learning, medical image segmentation, U-Net, channel attention mechanism, multi-scale fusion

CLC Number: TP183

Song WU, Xin LAN, Jingyang SHAN, Haiwen XU. Improved U-Net algorithm based on attention mechanism and multi-scale fusion[J]. Journal of Computer Applications: 24-28.

References

[1] LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 3431-3440.
[2] RONNEBERGER O, FISCHER P, BROX T. U-Net: convolutional networks for biomedical image segmentation[C]// Proceedings of the 2015 International Conference on Medical Image Computing and Computer-Assisted Intervention, LNCS 9351. Cham: Springer, 2015: 234-241.
[3] ÇIÇEK Ö, ABDULKADIR A, LIENKAMP S S, et al. 3D U-Net: learning dense volumetric segmentation from sparse annotation[C]// Proceedings of the 2016 International Conference on Medical Image Computing and Computer-Assisted Intervention, LNCS 9901. Cham: Springer, 2016: 424-432.
[4] OKTAY O, SCHLEMPER J, LE FOLGOC L, et al. Attention U-Net: learning where to look for the pancreas[EB/OL]. [2023-12-19].
[5] ALOM M Z, HASAN M, YAKOPCIC C, et al. Recurrent residual convolutional neural network based on U-Net (R2U-Net) for medical image segmentation[EB/OL]. [2023-12-19].
[6] ZHOU Z, RAHMAN SIDDIQUEE M M, TAJBAKHSH N, et al. UNet++: a nested U-Net architecture for medical image segmentation[C]// Proceedings of the 2018 International Workshop on Deep Learning in Medical Image Analysis/ International Workshop on Multimodal Learning for Clinical Decision Support, LNCS 11045. Cham: Springer, 2018: 3-11.
[7] CHEN J, LU Y, YU Q, et al. TransUNet: Transformers make strong encoders for medical image segmentation[EB/OL]. [2023-12-24].
[8] OU Y X, GAO M, ZHAO D, et al. SA-TF-UNet: MRI hippocampus segmentation based on spatial attention mechanism and Transformer[J]. Journal of Image and Graphics, 2023, 28(10): 3191-3202. (in Chinese)
[9] LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot MultiBox detector[C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9905. Cham: Springer, 2016: 21-37.
[10] FU C Y, LIU W, RANGA A, et al. DSSD: deconvolutional single shot detector[EB/OL]. [2023-12-22].
[11] LI Z, ZHOU F. FSSD: feature fusion single shot MultiBox detector[EB/OL]. [2023-12-13].
[12] LIU S, HUANG D, WANG Y. Receptive field block net for accurate and fast object detection[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11215. Cham: Springer, 2018: 404-419.
[13] ZHANG S, WEN L, BIAN X, et al. Single-shot refinement neural network for object detection[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 4203-4212.
[14] ZHAO Q, SHENG T, WANG Y, et al. M2Det: a single-shot object detector based on multi-level feature pyramid network[C]// Proceedings of the 33rd AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2019: 9259-9266.
[15] REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 779-788.
[16] LIM J S, ASTRID M, YOON H J, et al. Small object detection using context and attention[C]// Proceedings of the 2021 International Conference on Artificial Intelligence in Information and Communication. Piscataway: IEEE, 2021: 181-186.
[17] GLOROT X, BORDES A, BENGIO Y. Deep sparse rectifier neural networks[C]// Proceedings of the 14th International Workshop and Conference Proceedings on Artificial Intelligence and Statistics. New York: JMLR.org, 2011: 315-323.
[18] HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7132-7141.
[19] LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 936-944.
[20] CHEN L C, PAPANDREOU G, KOKKINOS I, et al. DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(4): 834-848.
[21] WU D, CAO L, ZHOU P, et al. Infrared small-target detection based on radiation characteristics with a multimodal feature fusion network[J]. Remote Sensing, 2022, 14(15): No.3570.
[22] JADON S. A survey of loss functions for semantic segmentation[C]// Proceedings of the 2020 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology. Piscataway: IEEE, 2020: 1-7.
[23] LAN D L, WANG X D, YAO Y, et al. Rectal cancer segmentation network based on adjacent slice attention fusion[J]. Journal of Computer Applications, 2023, 43(12): 3918-3926. (in Chinese)
[24] FANG C W, LI X, LI Z Y, et al. Semi-supervised medical image segmentation based on dual-model interactive learning[J]. Acta Automatica Sinica, 2023, 49(4): 805-819. (in Chinese)
[25] JHA D, SMEDSRUD P H, RIEGLER M A, et al. Kvasir-SEG: a segmented polyp dataset[C]// Proceedings of the 2020 International Conference on MultiMedia Modeling, LNCS 11962. Cham: Springer, 2020: 451-462.
[26] STAAL J, ABRÀMOFF M D, NIEMEIJER M, et al. Ridge-based vessel segmentation in color images of the retina[J]. IEEE Transactions on Medical Imaging, 2004, 23(4): 501-509.

Dataset	Algorithm	Dice	SE	Acc
Kvasir_seg	U-Net [2]	0.8332	0.8267	0.9337
Kvasir_seg	UNet++ [6]	0.8572	0.8254	0.9409
Kvasir_seg	FAUNet	0.8513	0.8461	0.9499
DRIVE	U-Net	0.8047	0.7729	0.9523
DRIVE	UNet++	0.8078	0.8017	0.9515
DRIVE	FAUNet	0.8129	0.8082	0.9527
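The Dice, SE, and Acc columns can be read against their usual confusion-matrix definitions, and the percentage-point gains of FAUNet over U-Net follow directly from the table rows. The sketch below assumes the standard definitions of the three metrics (this page does not spell them out) and uses hypothetical helper names; the score values are copied from the table above.

```python
def dice_coefficient(tp, fp, fn):
    """Dice = 2*TP / (2*TP + FP + FN), the standard overlap measure."""
    return 2 * tp / (2 * tp + fp + fn)


def sensitivity(tp, fn):
    """SE = TP / (TP + FN), the fraction of foreground pixels that are recovered."""
    return tp / (tp + fn)


def accuracy(tp, tn, fp, fn):
    """Acc = (TP + TN) / all pixels."""
    return (tp + tn) / (tp + tn + fp + fn)


def gain_in_points(baseline, improved):
    """Difference between two scores in [0, 1], expressed in percentage points."""
    return round((improved - baseline) * 100, 2)


# (Dice, SE, Acc) per dataset, copied from the comparison table.
UNET = {"Kvasir_seg": (0.8332, 0.8267, 0.9337), "DRIVE": (0.8047, 0.7729, 0.9523)}
FAUNET = {"Kvasir_seg": (0.8513, 0.8461, 0.9499), "DRIVE": (0.8129, 0.8082, 0.9527)}

# Per-dataset gains of FAUNet over U-Net for (Dice, SE, Acc).
GAINS = {
    dataset: tuple(gain_in_points(b, f) for b, f in zip(UNET[dataset], FAUNET[dataset]))
    for dataset in UNET
}
```

`GAINS` holds the gains computed from the table values, which can be compared against the figures quoted in the abstract.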

Algorithm	+ Attention module	+ Feature fusion module	Dice	SE	Acc
U-Net	-	-	0.8047	0.7729	0.9523
U-Net	√	-	0.7911	0.7366	0.9505
U-Net	-	√	0.8097	0.7991	0.9522
U-Net	√	√	0.8129	0.8082	0.9527
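The attention module in the ablation table is described in the abstract only as a channel attention mechanism placed on the skip connections. As a hedged illustration, the sketch below assumes a squeeze-and-excitation style gate in the spirit of reference 18; whether the authors use exactly this form, and with what reduction ratio, is not stated on this page. The function name, the weight layout, and the omission of biases are all assumptions.

```python
import math


def channel_attention(feature_map, w_reduce, w_expand):
    """Reweight the channels of a skip-connection feature map (SE-style sketch).

    feature_map: list of C channels, each an H x W nested list of floats.
    w_reduce:    R x C weight matrix of the bottleneck layer (biases omitted).
    w_expand:    C x R weight matrix of the expansion layer (biases omitted).
    Returns the rescaled feature map and the per-channel gates.
    """
    # Squeeze: global average pooling yields one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation: bottleneck layer with ReLU, then expansion layer with sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w_reduce]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w_expand
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[gate * value for value in row] for row in channel]
        for gate, channel in zip(gates, feature_map)
    ]
    return scaled, gates
```

With trained weights, channels whose gate is close to 0 contribute little to the features passed to the decoder, which is the sense in which the network "focuses" on the more informative channels.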
