Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (3): 988-995. DOI: 10.11772/j.issn.1001-9081.2024030358

• Multimedia Computing and Computer Simulation •


Medical image segmentation network integrating multi-scale semantics and parallel double-branch

Baohua YUAN1, Jialu CHEN1, Huan WANG2

  1. School of Computer Science and Artificial Intelligence, Changzhou University, Changzhou, Jiangsu 213159, China
    2. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, Jiangsu 210094, China
  • Received: 2024-04-01 Revised: 2024-05-26 Accepted: 2024-05-29 Online: 2024-06-17 Published: 2025-03-10
  • Contact: Baohua YUAN
  • About author: CHEN Jialu, born in 1997, M. S. candidate, CCF member. His research interests include medical image segmentation.
    WANG Huan, born in 1982, Ph. D., associate professor. His research interests include computer vision.
  • Supported by:
    National Natural Science Foundation of China(61703209)


Abstract:

In medical image segmentation networks, Convolutional Neural Networks (CNNs) extract rich local feature details but capture long-range information insufficiently, whereas Transformers capture long-range global feature dependencies but destroy local feature details. To fully exploit the complementarity of the two architectures, a parallel fusion network of CNN and Transformer for medical image segmentation, named PFNet, was proposed. In the parallel fusion module of this network, a pair of interdependent parallel branches based on CNN and Transformer was used to learn local and global discriminative features efficiently and to cross-fuse local features and long-range feature dependencies interactively. Meanwhile, to recover the spatial information lost during downsampling and thus enhance detail retention, a Multi-Scale Interaction (MSI) module was proposed to extract the local context of the multi-scale features generated by the hierarchical CNN branch for long-range dependency modeling. Experimental results show that PFNet outperforms advanced methods such as MISSFormer (Medical Image Segmentation tranSFormer) and UCTransNet (U-Net with Channel Transformer module): on the Synapse and ACDC (Automated Cardiac Diagnosis Challenge) datasets, compared with the best baseline method MISSFormer, PFNet improves the average Dice Similarity Coefficient (DSC) by 1.27% and 0.81%, respectively. These results indicate that PFNet achieves more accurate medical image segmentation.
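The parallel dual-branch idea described in the abstract can be illustrated with a minimal NumPy sketch: a convolutional branch that aggregates a local neighborhood, a self-attention branch that relates every position to every other, and a simple fusion of the two. This is a hypothetical toy on a single-channel 2D map for intuition only; the function names, the weighted-sum fusion, and the single-head, one-dimensional tokens are assumptions of this sketch, not the authors' actual PFNet implementation, which uses learned features and an interactive cross-fusion.

```python
import numpy as np

def local_branch(x, kernel):
    """CNN-style branch: 3x3 zero-padded convolution capturing local detail."""
    H, W = x.shape
    pad = np.pad(x, 1)
    out = np.zeros_like(x)
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(pad[i:i + 3, j:j + 3] * kernel)
    return out

def global_branch(x):
    """Transformer-style branch: single-head self-attention over all pixels,
    so every output position depends on the whole input (long-range modeling)."""
    H, W = x.shape
    tokens = x.reshape(-1, 1)                 # each pixel as a 1-dim token
    scores = tokens @ tokens.T                # pairwise similarity (Q = K = V)
    attn = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)   # softmax over each row
    return (attn @ tokens).reshape(H, W)

def parallel_fusion(x, kernel, alpha=0.5):
    """Run both branches in parallel and fuse; here a plain weighted sum
    stands in for the interactive cross-fusion used in the paper."""
    return alpha * local_branch(x, kernel) + (1 - alpha) * global_branch(x)
```

With an identity kernel (all zeros except the center), the local branch returns its input unchanged, which makes the fusion's behavior easy to inspect on a small array.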

Key words: medical image segmentation, Transformer, Convolutional Neural Network (CNN), parallel fusion, Multi-Scale Interaction (MSI)

CLC number: