Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (12): 3915-3921.DOI: 10.11772/j.issn.1001-9081.2023121828

• Multimedia computing and computer simulation •

Parallel medical image registration model based on convolutional neural network and Transformer

Xin ZHAO, Xinjie LI, Jian XU, Buyun LIU, Xiang BI

  1. School of Information Engineering, Dalian University, Dalian, Liaoning 116622, China
  • Received: 2024-01-02 Revised: 2024-04-02 Accepted: 2024-04-07 Online: 2024-04-19 Published: 2024-12-10
  • Contact: Xin ZHAO
  • About author:LI Xinjie, born in 1999, M. S. candidate. His research interests include deep learning, medical image registration, and computer vision.
    XU Jian, born in 1999, M. S. candidate. His research interests include medical image processing and lightweight deep learning models.
    LIU Buyun, born in 1999, M. S. candidate. Her research interests include medical image processing.
    BI Xiang, born in 2002, M. S. candidate. His research interests include semi-supervised medical image segmentation.
  • Supported by:
    National Natural Science Foundation of China(61971424)

Abstract:

Medical image registration models aim to establish the correspondence of anatomical positions between images. Traditional image registration methods obtain the deformation field through iterative optimization, which is time-consuming and offers limited accuracy. Deep neural networks not only enable end-to-end generation of deformation fields, thereby speeding up deformation field generation, but also further improve registration accuracy. However, current deep learning registration models all adopt a single Convolutional Neural Network (CNN) or Transformer architecture, so they cannot fully exploit the advantages of combining CNN and Transformer, which leads to insufficient registration accuracy, and they cannot effectively preserve the original topology after registration. To solve these problems, a parallel medical image registration model based on CNN and Transformer, PPCTNet (Parallel Processing of CNN and Transformer Network), was proposed. Firstly, the model was constructed from Swin Transformer, which currently provides excellent registration accuracy, and LOCV-Net (Lightweight attentiOn-based ConVolutional Network), an extremely lightweight CNN. Then, a fusion strategy was designed to fully integrate the feature information extracted by Swin Transformer and LOCV-Net, so that the model not only had the local feature extraction capability of CNN and the long-range dependency modeling capability of Transformer, but also retained the advantage of being lightweight. Finally, PPCTNet was compared with 10 classical image registration models on a brain Magnetic Resonance Imaging (MRI) dataset. The results show that, compared with TransMorph (hybrid Transformer-ConvNet network for image registration), a currently excellent registration model, the highest registration accuracy of PPCTNet is 0.5 percentage points higher and the folding rate of the deformation field is 1.56 percentage points lower, so the topological structures of the registered images are better maintained. Besides, compared with TransMorph, PPCTNet reduces the number of parameters by 10.39×10⁶ and the computational cost by 278×10⁹, reflecting the lightweight advantage of PPCTNet.
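The abstract describes a parallel encoder in which a lightweight convolutional branch and a Swin Transformer branch process the same features and their outputs are fused. The following PyTorch sketch illustrates one plausible wiring of such a parallel fusion block; the branch implementations (LocalConvBranch, WindowAttentionBranch) and the concatenation-plus-1×1-convolution fusion are illustrative assumptions only, not the actual PPCTNet, LOCV-Net, or Swin Transformer designs, which the abstract does not specify.

# Illustrative sketch only: module names and the fusion scheme are assumptions,
# not the published PPCTNet architecture.
import torch
import torch.nn as nn

class LocalConvBranch(nn.Module):
    """Lightweight convolutional branch (stand-in for a LOCV-Net block):
    depthwise + pointwise 3D convolutions capture local features cheaply."""
    def __init__(self, channels):
        super().__init__()
        self.depthwise = nn.Conv3d(channels, channels, 3, padding=1, groups=channels)
        self.pointwise = nn.Conv3d(channels, channels, 1)
        self.act = nn.LeakyReLU(0.2)

    def forward(self, x):
        return self.act(self.pointwise(self.depthwise(x)))

class WindowAttentionBranch(nn.Module):
    """Transformer branch (stand-in for a Swin Transformer block):
    self-attention over flattened voxel tokens models long-range dependencies."""
    def __init__(self, channels, num_heads=4):
        super().__init__()
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, x):
        b, c, d, h, w = x.shape
        tokens = self.norm(x.flatten(2).transpose(1, 2))   # (B, D*H*W, C)
        out, _ = self.attn(tokens, tokens, tokens)
        return out.transpose(1, 2).reshape(b, c, d, h, w)

class ParallelFusionBlock(nn.Module):
    """Runs both branches on the same input in parallel and fuses their
    features by channel concatenation followed by a 1x1 convolution."""
    def __init__(self, channels):
        super().__init__()
        self.conv_branch = LocalConvBranch(channels)
        self.transformer_branch = WindowAttentionBranch(channels)
        self.fuse = nn.Conv3d(2 * channels, channels, 1)

    def forward(self, x):
        local_feat = self.conv_branch(x)
        global_feat = self.transformer_branch(x)
        return self.fuse(torch.cat([local_feat, global_feat], dim=1))

# Example: a small 3D feature map, e.g. from an encoder stage of a registration network.
feat = torch.randn(1, 16, 8, 8, 8)
fused = ParallelFusionBlock(16)(feat)
print(fused.shape)  # torch.Size([1, 16, 8, 8, 8])

Concatenation followed by a projection is only one common way to fuse parallel CNN and Transformer features (addition and gated attention are alternatives); the paper itself should be consulted for PPCTNet's actual fusion strategy.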

Key words: medical image, image registration, Convolutional Neural Network (CNN), Transformer architecture, lightweight convolution
