Journal of Computer Applications, 2023, Vol. 43, Issue (5): 1584-1595. DOI: 10.11772/j.issn.1001-9081.2022040530

• Multimedia computing and computer simulation •

Transformer based U-shaped medical image segmentation network: a survey

Liyao FU1, Mengxiao YIN1,2, Feng YANG1,2

  1. School of Computer, Electronics and Information, Guangxi University, Nanning Guangxi 530004, China
  2. Guangxi Key Laboratory of Multimedia Communications and Network Technology (Guangxi University), Nanning Guangxi 530004, China
  • Received: 2022-04-18 Revised: 2022-07-02 Accepted: 2022-07-04 Online: 2022-07-26 Published: 2023-05-10
  • Contact: Feng YANG
  • About the authors: FU Liyao, born in 1998, M. S. candidate. Her research interests include computer vision and medical image segmentation.
    YIN Mengxiao, born in 1978, Ph. D., associate professor. Her research interests include computer graphics and virtual reality, digital geometry processing, image and video editing, and graph theory and its applications.
    YANG Feng, born in 1979, Ph. D., associate professor. His research interests include artificial intelligence, network information security, big data and high-performance computing, and precision medicine.
  • Supported by:
    National Natural Science Foundation of China (61861004)

基于Transformer的U型医学图像分割网络综述

傅励瑶1, 尹梦晓1,2, 杨锋1,2

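  1. School of Computer, Electronics and Information, Guangxi University, Nanning 530004
  2. Guangxi Key Laboratory of Multimedia Communications and Network Technology (Guangxi University), Nanning 530004
  • Corresponding author: YANG Feng (杨锋)
  • About the authors: FU Liyao (傅励瑶), born in 1998, female, from Chongqing, M. S. candidate. Her research interests include computer vision and medical image segmentation.
    YIN Mengxiao (尹梦晓), born in 1978, female, from Nanyang, Henan, associate professor, Ph. D., CCF member. Her research interests include computer graphics and virtual reality, digital geometry processing, image and video editing, and graph theory and its applications.
    YANG Feng (杨锋), born in 1979, male, from Yulin, Guangxi, associate professor, Ph. D., CCF member. His research interests include artificial intelligence, network information security, big data and high-performance computing, and precision medicine. fyang@foxmail.com
  • Supported by:
    National Natural Science Foundation of China (61861004)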

Abstract:

U-shaped Network (U-Net), which is based on the Fully Convolutional Network (FCN), is widely used as the backbone of medical image segmentation models. However, Convolutional Neural Networks (CNNs) are poor at capturing long-range dependencies, which limits further improvement in segmentation performance. To address this problem, researchers have introduced Transformer into medical image segmentation models to compensate for this weakness of CNNs, and U-shaped segmentation networks that incorporate Transformer have become a hot research topic. After a detailed introduction to U-Net and Transformer, the related medical image segmentation models are categorized by the position of the Transformer module: only in the encoder or decoder, in both the encoder and decoder, as a skip connection, and in other positions. The basic content, design concepts, and possible improvements of these models are discussed, and the advantages and disadvantages of placing Transformer in different positions are analyzed. The analysis shows that the most important factor in deciding where to place Transformer is the characteristics of the target segmentation task, and that segmentation models combining Transformer with U-Net can better exploit the respective advantages of CNN and Transformer to improve segmentation performance, which gives them promising development prospects and research value.

Key words: deep learning, Convolutional Neural Network (CNN), medical image segmentation, U-shaped Network (U-Net), Transformer

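Abstract (translated from the Chinese version):

At present, medical image segmentation models widely adopt the U-shaped Network (U-Net), which is based on the Fully Convolutional Network (FCN), as their backbone, but the weakness of the Convolutional Neural Network (CNN) in capturing long-range dependencies limits further improvement in segmentation performance. To address this problem, researchers have applied Transformer to medical image segmentation models to compensate for the shortcomings of CNN, and segmentation networks that combine Transformer with a U-shaped structure have become one of the research hotspots. After a detailed introduction to U-Net and Transformer, medical image segmentation models are classified by the position of the Transformer module: only in the encoder or decoder, in both the encoder and decoder, as a skip connection, and in other positions. The basic content, design concepts, and possible improvements of each model are discussed, and the advantages and disadvantages of placing Transformer in different positions are analyzed. The analysis shows that the most important factor in deciding the position of Transformer is the characteristics of the target segmentation task, and that segmentation models combining Transformer with U-Net can better exploit the respective strengths of CNN and Transformer to improve segmentation performance, giving them considerable development prospects and research value.

Key words (translated from the Chinese version): deep learning, Convolutional Neural Network, medical image segmentation, U-shaped Network, Transformer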

CLC Number: