Journal of Computer Applications, 2026, Vol. 46, Issue (5): 1692-1702. DOI: 10.11772/j.issn.1001-9081.2025050645
• Frontier and comprehensive applications •
Jing HU1, Shikun CHEN1, Fang WANG1, Rui ZHANG1, Yong WANG2
Received: 2025-06-12
Revised: 2025-08-10
Accepted: 2025-09-09
Online: 2025-09-25
Published: 2026-05-10
Contact: Shikun CHEN
About author: HU Jing, born in 1977 in Taiyuan, Shanxi, Ph. D., professor, CCF senior member. Her research interests include image processing and deep learning.
Jing HU, Shikun CHEN, Fang WANG, Rui ZHANG, Yong WANG. Ore image segmentation with linear deformable convolution and dual-domain synergistic dynamic attention[J]. Journal of Computer Applications, 2026, 46(5): 1692-1702.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2025050645
| Model | Dice/% | HD95/mm | mIoU/% | P/% | R/% | PA/% |
|---|---|---|---|---|---|---|
| U-net | 85.20 | 23.11 | 72.50 | 82.52 | 89.63 | 90.55 |
| Swin-Unet | 88.83 | 20.13 | 76.50 | 86.50 | 91.20 | 92.85 |
| TransUnet | 87.95 | 21.50 | 73.30 | 82.30 | 91.11 | 92.40 |
| TransFuse | 79.61 | 25.30 | 67.86 | 72.57 | 88.22 | 89.50 |
| MALUNet | 87.96 | 21.07 | 78.69 | 85.07 | 91.30 | 92.95 |
| VM-Unet | 90.06 | 19.00 | 82.59 | 89.86 | 92.29 | 93.60 |
| ViT-CoMer | 89.85 | 19.92 | 83.65 | 89.50 | 90.20 | 93.75 |
| LDDA-Net | 91.54 | 16.84 | 85.13 | 90.01 | 91.09 | 94.10 |
Tab. 1 Performance comparison of different models
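The overlap-based columns in the table (Dice, mIoU, P, R, PA) can be illustrated with a minimal sketch. This page does not define the metrics, so the code below assumes the standard confusion-matrix definitions for a binary foreground/background mask and averages IoU over those two classes; the paper's exact averaging may differ. The function names `confusion_counts` and `segmentation_metrics` are hypothetical, and HD95 is omitted because it depends on a boundary-distance convention not stated here.

```python
from typing import Dict, List

Mask = List[List[int]]  # binary mask: 1 = ore (foreground), 0 = background


def confusion_counts(pred: Mask, truth: Mask) -> Dict[str, int]:
    """Count TP/FP/FN/TN pixel by pixel for two masks of identical shape."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                counts["tp"] += 1
            elif p and not t:
                counts["fp"] += 1
            elif t and not p:
                counts["fn"] += 1
            else:
                counts["tn"] += 1
    return counts


def segmentation_metrics(pred: Mask, truth: Mask) -> Dict[str, float]:
    """Standard binary-segmentation metrics as fractions in [0, 1]."""
    c = confusion_counts(pred, truth)
    tp, fp, fn, tn = c["tp"], c["fp"], c["fn"], c["tn"]

    def ratio(num: int, den: int) -> float:
        # Return 0.0 for an empty denominator instead of raising.
        return num / den if den else 0.0

    iou_fg = ratio(tp, tp + fp + fn)  # IoU of the foreground class
    iou_bg = ratio(tn, tn + fp + fn)  # IoU of the background class
    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "miou": (iou_fg + iou_bg) / 2,  # assumption: mean over the two classes
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "pa": ratio(tp + tn, tp + fp + fn + tn),
    }


if __name__ == "__main__":
    pred = [[1, 1, 0], [0, 1, 0]]
    truth = [[1, 0, 0], [0, 1, 1]]
    for name, value in segmentation_metrics(pred, truth).items():
        print(f"{name}: {100 * value:.2f}%")
```

Multiplying each fraction by 100 gives values on the same percentage scale as the table columns.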
| Enabled loss terms (of BCE, Boundary, DPAG) | Dice/% | HD95/mm | mIoU/% | PA/% |
|---|---|---|---|---|
| √ | 82.23 | 24.45 | 77.65 | 88.15 |
| √ | 81.85 | 23.92 | 76.50 | 87.80 |
| √ √ | 83.63 | 22.02 | 78.23 | 89.30 |
| √ √ √ | 85.81 | 20.15 | 80.74 | 90.85 |
Tab. 2 Ablation experimental results of loss functions
| Method | Dice/% | HD95/mm | mIoU/% | Params/10^6 | GFLOPS |
|---|---|---|---|---|---|
| Conv=3×3, N=9 | 85.81 | 20.15 | 80.74 | 11.89 | 7.81 |
| LDConv, N=5 | 83.20 | 23.76 | 77.23 | 9.49 | 6.26 |
| LDConv, N=7 | 84.24 | 21.25 | 78.99 | 10.94 | 7.16 |
| LDConv, N=9 | 88.75 | 18.75 | 83.42 | 12.38 | 8.10 |
| DefConv=3×3, N=9 | 88.03 | 19.20 | 82.20 | 13.92 | 9.02 |
| LDConv, N=11 | 90.12 | 18.94 | 84.12 | 15.83 | 11.63 |
Tab. 3 Performance comparison of different sampling point numbers N
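One way to read the accuracy/cost trade-off in this table is to check which configurations are dominated, meaning another configuration reaches at least the same Dice at no more compute. The sketch below is only an illustrative reading aid, not part of the authors' method: it copies the Dice and GFLOPS columns from the table, and the names `Config`, `dominated`, and `pareto_front` are hypothetical.

```python
from typing import List, NamedTuple


class Config(NamedTuple):
    name: str
    dice: float    # Dice/% (higher is better)
    gflops: float  # compute cost (lower is better)


# Dice and GFLOPS values copied from Tab. 3.
TABLE_3 = [
    Config("Conv=3×3, N=9", 85.81, 7.81),
    Config("LDConv, N=5", 83.20, 6.26),
    Config("LDConv, N=7", 84.24, 7.16),
    Config("LDConv, N=9", 88.75, 8.10),
    Config("DefConv=3×3, N=9", 88.03, 9.02),
    Config("LDConv, N=11", 90.12, 11.63),
]


def dominated(c: Config, others: List[Config]) -> bool:
    """True if another config is no worse on both axes and strictly better on one."""
    return any(
        o.dice >= c.dice
        and o.gflops <= c.gflops
        and (o.dice > c.dice or o.gflops < c.gflops)
        for o in others
    )


def pareto_front(configs: List[Config]) -> List[Config]:
    """Keep only the configs that no other config dominates."""
    return [c for c in configs if not dominated(c, configs)]


if __name__ == "__main__":
    for c in pareto_front(TABLE_3):
        print(f"{c.name}: Dice {c.dice:.2f}%, {c.gflops:.2f} GFLOPS")
```

With the reported numbers, the only dominated row is the 3×3 deformable convolution, which LDConv with N=9 beats on Dice at lower GFLOPS; every other row trades accuracy against cost.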
| Model | Enabled components (encoders: ResNetV2, Transformer; modules: LDConv, DAM) | n_Skip | Dice/% | HD95/mm | mIoU/% | PA/% |
|---|---|---|---|---|---|---|
| M0 | √ | 3 | 83.52 | 22.58 | 78.15 | 87.95 |
| M1 | √ √ | 3 | 85.81 | 20.15 | 80.74 | 90.85 |
| M2 | √ √ √ | 3 | 88.75 | 18.75 | 83.42 | 92.12 |
| M3 | √ √ √ | 3 | 87.21 | 17.99 | 82.20 | 91.50 |
| M4 | √ √ √ | 3 | 89.55 | 18.10 | 84.05 | 92.80 |
| M5 | √ √ √ √ | 1 | 89.80 | 18.20 | 83.90 | 90.84 |
| M6 | √ √ √ √ | 2 | 89.46 | 17.85 | 83.88 | 92.30 |
| M7 | √ √ √ √ | 3 | 91.54 | 16.84 | 85.13 | 94.10 |
Tab. 4 Experimental results of module ablation
| Model | Dice/% | HD95/mm | mIoU/% | PA/% |
|---|---|---|---|---|
| LDDA-Net | 88.85 | 17.37 | 86.67 | 93.30 |
| U-net | 80.07 | 23.51 | 77.05 | 88.74 |
| HiFormer | 84.13 | 19.95 | 82.58 | 90.05 |
| Att U-net | 83.22 | 20.84 | 80.36 | 89.82 |
| HRNet | 82.44 | 21.23 | 80.44 | 89.72 |
| SegFormer | 85.21 | 18.26 | 83.94 | 91.41 |
| Mask2Former | 86.52 | 17.25 | 83.29 | 90.16 |
Tab. 5 Performance comparison of different models on Ore dataset