Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (4): 1120-1129. DOI: 10.11772/j.issn.1001-9081.2024040415
Kunyuan JIANG1, Xiaoxia LI1,2(), Li WANG3, Yaodan CAO3, Xiaoqiang ZHANG1,2, Nan DING1, Yingyue ZHOU1,2
Received:
2024-04-11
Revised:
2024-06-26
Accepted:
2024-06-28
Online:
2025-04-08
Published:
2025-04-10
Contact:
Xiaoxia LI
About author:
JIANG Kunyuan, born in 2000 in Zibo, Shandong, M. S. candidate, CCF member. Her research interests include pattern recognition and medical image processing.
Supported by:
Abstract:
To address the loss of lesion-edge information and the incomplete segmentation of large lesions in endoscopic semantic segmentation networks, a Boundary-Cross Supervised semantic Segmentation Network (BCS-SegNet) incorporating Decoupled Residual self-Attention (DRA) is proposed. First, DRA is introduced to strengthen the network's ability to learn long-range correlations among lesions. Second, a Cross-Level Fusion (CLF) module is constructed to combine the multi-level feature maps of the encoder pairwise, fusing image detail with semantic information at low computational cost. Finally, multi-direction, multi-scale 2D Gabor transforms are used to extract edge information, and spatial attention weights the edge features in the feature maps to supervise the decoding process of the segmentation network, providing more precise intra-class segmentation consistency at the pixel level. Experimental results show that BCS-SegNet achieves a mean Intersection over Union (mIoU) and Dice coefficient of 84.27% and 90.68% on the ISIC2018 dermoscopy dataset, and 79.24% and 87.91% on the Kvasir-SEG/CVC-ClinicDB colonoscopy datasets; on a self-built esophageal endoscopy dataset it reaches an mIoU of 82.73% and a Dice coefficient of 90.84%, relative mIoU improvements of 3.30% and 4.97% over U-net and UCTransNet, respectively. The proposed network therefore produces more complete segmentation regions and sharper edge details.
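The boundary supervision branch described above builds an edge prior from multi-direction, multi-scale 2D Gabor responses. The following NumPy sketch illustrates only that extraction step; the kernel sizes, σ, λ, γ and the number of orientations are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def gabor_kernel(size=9, sigma=2.0, theta=0.0, lam=4.0, gamma=0.5, psi=0.0):
    """Real part of a 2D Gabor kernel at orientation theta (illustrative defaults)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = x * np.cos(theta) + y * np.sin(theta)      # rotate coordinates
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return np.exp(-(xr**2 + (gamma * yr)**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / lam + psi)

def conv2d(img, k):
    """Same-size 2D correlation with reflect padding (naive loops, for clarity)."""
    half = k.shape[0] // 2
    padded = np.pad(img, half, mode="reflect")
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + k.shape[0], j:j + k.shape[1]] * k)
    return out

def gabor_edge_map(img, n_orient=4, sizes=(7, 11)):
    """Max |response| over orientations and scales -> boundary prior in [0, 1]."""
    resp = np.zeros_like(img, dtype=float)
    for s in sizes:
        for t in np.linspace(0.0, np.pi, n_orient, endpoint=False):
            resp = np.maximum(resp, np.abs(conv2d(img, gabor_kernel(size=s, theta=t))))
    rng = resp.max() - resp.min()
    return (resp - resp.min()) / (rng + 1e-8)        # min-max normalize
```

In the paper this edge map is further weighted by spatial attention and used to supervise the decoder; the sketch stops at the raw Gabor boundary prior.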
CLC number:
Kunyuan JIANG, Xiaoxia LI, Li WANG, Yaodan CAO, Xiaoqiang ZHANG, Nan DING, Yingyue ZHOU. Boundary-cross supervised semantic segmentation network with decoupled residual self-attention[J]. Journal of Computer Applications, 2025, 45(4): 1120-1129.
| Image type | Training samples | Validation samples | Test samples | Total |
| --- | --- | --- | --- | --- |
| Dermoscopy images | 2 047 | 260 | 260 | 2 567 |
| Colonoscopy images | 1 450 | 162 | 160 | 1 772 |
| Esophageal endoscopy images | 2 552 | 310 | 310 | 3 172 |

Tab. 1 Numbers of samples in each experimental dataset
| Network type | Network | mIoU/% | Dice/% | FLOPs/GFLOPs | Parameters/10⁶ | Frame rate/(frame·s⁻¹) |
| --- | --- | --- | --- | --- | --- | --- |
| CNN | U-net | 80.09 | 87.19 | 226.15 | 24.89 | 20 |
| | DeepLabV3+ | 79.80 | 88.53 | 264.60 | 70.07 | 12 |
| | U2-Net | 74.43 | 83.31 | 150.61 | 43.99 | 18 |
| | EGE-UNet | 63.40 | 75.46 | 0.28 | 0.04 | 26 |
| Transformer+CNN | MedT | 68.55 | 76.92 | 70.89 | 10.80 | 24 |
| | TransUnet | 81.21 | 87.86 | 129.29 | 93.23 | 14 |
| | UCTransNet | 78.81 | 86.90 | 172.01 | 66.24 | 12 |
| | BCS-SegNet | 82.73 | 90.84 | 232.71 | 24.98 | 19 |

Tab. 2 Comparison of results of different networks on the self-built esophageal endoscopy dataset
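Tables 2 and 3 report mean Intersection over Union (mIoU) and the Dice coefficient. A minimal sketch of both metrics for a binary lesion mask follows; the convention of averaging IoU over the foreground and background classes is an assumption about how mIoU is computed here:

```python
import numpy as np

def miou_dice(pred, gt):
    """Return (mIoU, Dice) for binary masks.
    mIoU averages IoU over foreground and background; Dice is foreground-only."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    ious = []
    for cls in (True, False):                      # foreground, then background
        p, g = (pred == cls), (gt == cls)
        inter = np.logical_and(p, g).sum()
        union = np.logical_or(p, g).sum()
        ious.append(inter / union if union else 1.0)
    inter = np.logical_and(pred, gt).sum()
    dice = 2.0 * inter / (pred.sum() + gt.sum() + 1e-8)
    return float(np.mean(ious)), float(dice)
```

For identical masks both metrics approach 1.0; partially overlapping masks give the familiar IoU < Dice ordering on the foreground class.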
| Network type | Method | ISIC2018 mIoU/% | ISIC2018 Dice/% | Kvasir-SEG/CVC-ClinicDB mIoU/% | Kvasir-SEG/CVC-ClinicDB Dice/% | FLOPs/GFLOPs | Parameters/10⁶ |
| --- | --- | --- | --- | --- | --- | --- | --- |
| CNN | U-net | 77.28 | 87.15 | 78.48 | 87.63 | 226.15 | 24.89 |
| | UPerNet | 77.37 | 89.39 | 75.44 | 84.18 | 30.73 | 27.39 |
| | UNeXt | 72.27 | 82.52 | 75.96 | 86.19 | 2.30 | 1.47 |
| | EGE-UNet | 80.12 | 88.96 | 68.97 | 81.64 | 0.28 | 0.04 |
| Transformer+CNN | MedT | 71.19 | 82.41 | 76.79 | 84.32 | 70.89 | 10.80 |
| | BCS-SegNet | 84.27 | 90.68 | 79.24 | 87.91 | 232.70 | 24.98 |

Tab. 3 Comparison of results of different networks on public datasets
| No. | Network | mIoU/% | Dice/% |
| --- | --- | --- | --- |
| 1 | U-net | 80.09 | 87.19 |
| 2 | +DRA | 81.33 | 88.24 |
| 3 | +CLF | 80.61 | 88.12 |
| 4 | +BSD | 82.70 | 90.72 |
| 5 | +DRA, CLF | 81.52 | 88.31 |
| 6 | +DRA, CLF, BSD (BCS-SegNet) | 82.73 | 90.84 |

Tab. 4 Ablation experimental results for each module
| f₁ | f₂ | f₃ | f₄ | f₅ | mIoU/% | Dice/% | FLOPs/GFLOPs | Parameters/10⁶ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| √ | | | | | 81.47 | 88.36 | 235.48 | 24.91 |
| | √ | | | | 81.33 | 88.24 | 232.30 | 24.98 |
| | | √ | | | 80.19 | 87.35 | 231.76 | 25.23 |
| | | | √ | | 80.16 | 86.96 | 231.42 | 26.21 |
| | | | | √ | 79.44 | 85.87 | 227.52 | 26.21 |

Tab. 5 Ablation experimental results of decoupled residual self-attention
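Table 5 varies the encoder level (f₁-f₅) at which DRA is inserted. The paper's decoupled residual formulation is not reproduced here; as generic background only, plain scaled dot-product self-attention with a residual connection over flattened feature-map pixels can be sketched as:

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Generic residual self-attention over x of shape (n_tokens, d).
    w_q, w_k, w_v are (d, d) projection matrices; NOT the paper's exact DRA."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])        # scaled dot-product
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)       # row-wise softmax
    return x + attn @ v                            # residual connection
```

Inserting such a block at a deeper level shrinks the token count (smaller spatial map) but raises the channel width, which is consistent with the FLOPs/parameter trend in Table 5.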
| No. | Fusion strategy | mIoU/% | Dice/% |
| --- | --- | --- | --- |
| 1 | Fusion | 80.61 | 88.12 |
| 2 | Fusion | 80.38 | 87.54 |
| 3 | Fusion | 79.43 | 86.39 |

Tab. 6 Ablation experimental results of cross-level fusion
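Table 6 compares pairwise fusion strategies for the CLF module. A generic sketch of fusing one shallow/deep encoder pair, nearest-neighbour upsampling of the deeper map followed by channel concatenation and a 1×1 convolution, is given below; the actual CLF design may differ:

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def fuse_pair(shallow, deep, w):
    """Pairwise cross-level fusion sketch.
    shallow: (Cs, H, W); deep: (Cd, H/2, W/2); w: (Cout, Cs+Cd) 1x1-conv weights."""
    cat = np.concatenate([shallow, upsample2x(deep)], axis=0)    # (Cs+Cd, H, W)
    c, h, wd = cat.shape
    return (w @ cat.reshape(c, -1)).reshape(w.shape[0], h, wd)   # 1x1 convolution
```

Because the mixing is a 1×1 convolution over concatenated channels, the cost grows only with channel count, not with kernel area, which matches the abstract's "low computational cost" claim for CLF.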
[1] DONG Y N, LIU T L, DAI X B, et al. Medical image processing theory and applications[M]. Nanjing: Southeast University Press, 2020: 44-54.
[2] DOU M, CHEN Z B, WANG X, et al. Review of multi-modal medical image segmentation based on deep learning[J]. Journal of Computer Applications, 2023, 43(11): 3385-3395.
[3] GONZALEZ R C, WOODS R E. Digital image processing, fourth edition[M]. RUAN Q Q, RUAN Y Z, translated. Beijing: Publishing House of Electronics Industry, 2020: 504-572.
[4] RONNEBERGER O, FISCHER P, BROX T. U-net: convolutional networks for biomedical image segmentation[C]// Proceedings of the 2015 International Conference on Medical Image Computing and Computer-Assisted Intervention, LNCS 9351. Cham: Springer, 2015: 234-241.
[5] ASGARI TAGHANAKI S, ABHISHEK K, COHEN J P, et al. Deep semantic segmentation of natural and medical images: a review[J]. Artificial Intelligence Review, 2021, 54: 137-178.
[6] CHEN L C, ZHU Y, PAPANDREOU G, et al. Encoder-decoder with atrous separable convolution for semantic image segmentation[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11211. Cham: Springer, 2018: 833-851.
[7] XIAO T, LIU Y, ZHOU B, et al. Unified perceptual parsing for scene understanding[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11209. Cham: Springer, 2018: 432-448.
[8] QIN X B, ZHANG Z C, HUANG C Y, et al. U2-Net: going deeper with nested U-structure for salient object detection[J]. Pattern Recognition, 2020, 106: No.107404.
[9] VALANARASU J M J, PATEL V M. UNeXt: MLP-based rapid medical image segmentation network[C]// Proceedings of the 2022 International Conference on Medical Image Computing and Computer-Assisted Intervention, LNCS 13435. Cham: Springer, 2022: 23-33.
[10] D'ASCOLI S, TOUVRON H, LEAVITT M L, et al. ConViT: improving vision Transformers with soft convolutional inductive biases[C]// Proceedings of the 38th International Conference on Machine Learning. New York: JMLR.org, 2021: 2286-2296.
[11] AZAD R, KAZEROUNI A, HEIDARI M, et al. Advances in medical image analysis with vision Transformers: a comprehensive review[J]. Medical Image Analysis, 2024, 91: No.103000.
[12] CHEN J, LU Y, YU Q, et al. TransUNet: Transformers make strong encoders for medical image segmentation[EB/OL]. [2023-11-21].
[13] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2017: 6000-6010.
[14] WANG H, CAO P, WANG J, et al. UCTransNet: rethinking the skip connections in U-Net from a channel-wise perspective with Transformer[C]// Proceedings of the 36th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2022: 2441-2449.
[15] ZHANG Y, LIU H Y, HU Q. TransFuse: fusing Transformers and CNNs for medical image segmentation[C]// Proceedings of the 2021 International Conference on Medical Image Computing and Computer-Assisted Intervention, LNCS 12901. Cham: Springer, 2021: 14-24.
[16] WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11211. Cham: Springer, 2018: 3-19.
[17] HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7132-7141.
[18] VALANARASU J M J, OZA P, HACIHALILOGLU I, et al. Medical Transformer: gated axial-attention for medical image segmentation[C]// Proceedings of the 2021 International Conference on Medical Image Computing and Computer-Assisted Intervention, LNCS 12901. Cham: Springer, 2021: 36-46.
[19] RUAN J, XIE M, GAO J, et al. EGE-UNet: an efficient group enhanced UNet for skin lesion segmentation[C]// Proceedings of the 2023 International Conference on Medical Image Computing and Computer-Assisted Intervention, LNCS 14223. Cham: Springer, 2023: 481-490.
[20] SANDLER M, HOWARD A, ZHU M, et al. MobileNetV2: inverted residuals and linear bottlenecks[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 4510-4520.
[21] KOKKINOS I. Pushing the boundaries of boundary detection using deep learning[EB/OL]. [2023-11-21].
[22] NONG Z, SU X, LIU Y, et al. Boundary-aware dual-stream network for VHR remote sensing images semantic segmentation[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2021, 14: 5260-5268.
[23] CONG R, YANG H, JIANG Q, et al. BCS-Net: boundary, context, and semantic for automatic COVID-19 lung infection segmentation from CT images[J]. IEEE Transactions on Instrumentation and Measurement, 2022, 71: No.5019011.
[24] CHEN F L, LIU H L, ZENG Z H, et al. BES-Net: boundary enhancing semantic context network for high-resolution image semantic segmentation[J]. Remote Sensing, 2022, 14(7): No.1638.
[25] LIN Y, ZHANG D, FANG X, et al. Rethinking boundary detection in deep learning models for medical image segmentation[C]// Proceedings of the 2023 International Conference on Information Processing in Medical Imaging, LNCS 13939. Cham: Springer, 2023: 730-742.
[26] FU L Y, YIN M X, YANG F. Transformer based U-shaped medical image segmentation network: a survey[J]. Journal of Computer Applications, 2023, 43(5): 1584-1595.
[27] ZHU X A, CAO L. Wavelet analysis and its application in digital image processing[M]. Beijing: Publishing House of Electronics Industry, 2012: 163-169, 213-221.
[28] MILLETARI F, NAVAB N, AHMADI S A. V-Net: fully convolutional neural networks for volumetric medical image segmentation[C]// Proceedings of the 4th International Conference on 3D Vision. Piscataway: IEEE, 2016: 565-571.
[29] CODELLA N, ROTEMBERG V, TSCHANDL P, et al. Skin lesion analysis toward melanoma detection 2018: a challenge hosted by the International Skin Imaging Collaboration (ISIC)[EB/OL]. [2023-12-02].
[30] JHA D, SMEDSRUD P H, RIEGLER M A, et al. Kvasir-SEG: a segmented polyp dataset[C]// Proceedings of the 2020 International Conference on Multimedia Modeling, LNCS 11962. Cham: Springer, 2020: 451-462.
[31] TAJBAKHSH N, GURUDU S R, LIANG J. Automated polyp detection in colonoscopy videos using shape and context information[J]. IEEE Transactions on Medical Imaging, 2016, 35(2): 630-644.
[32] KINGMA D P, BA J L. Adam: a method for stochastic optimization[EB/OL]. [2023-12-02].