Journal of Computer Applications ›› 2026, Vol. 46 ›› Issue (5): 1692-1702. DOI: 10.11772/j.issn.1001-9081.2025050645

• Frontier and comprehensive applications

Ore image segmentation with linear deformable convolution and dual-domain synergistic dynamic attention

Jing HU1, Shikun CHEN1, Fang WANG1, Rui ZHANG1, Yong WANG2

  1. School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan Shanxi 030024, China
  2. Civil Explosives Engineering Branch of Shanxi Coking Coal Group Company Limited, Taiyuan Shanxi 030300, China
  • Received: 2025-06-12 Revised: 2025-08-10 Accepted: 2025-09-09 Online: 2025-09-25 Published: 2026-05-10
  • Contact: Shikun CHEN
  • About the authors: HU Jing, born in 1977, Ph. D., professor. Her research interests include image processing and deep learning.
    WANG Fang, born in 1989, M. S., lecturer. Her research interests include medical image segmentation.
    ZHANG Rui, born in 1987, Ph. D., associate professor. His research interests include intelligent information processing.
    WANG Yong, born in 1984, engineer. His research interests include artificial intelligence and big data mining.
  • Supported by:
    Shanxi Provincial Natural Science Foundation (202203021211189); Enterprise Commissioned Horizontal Project (2021035)

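Ore image segmentation based on linear deformable convolution and dual-domain synergistic dynamic attention (基于线性可变形卷积与双域协同动态注意力的矿石图像分割)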

Jing HU (胡静)1, Shikun CHEN (陈世堃)1, Fang WANG (王芳)1, Rui ZHANG (张睿)1, Yong WANG (王勇)2

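  1. School of Computer Science and Technology, Taiyuan University of Science and Technology (太原科技大学 计算机科学与技术学院), Taiyuan 030024
  2. Civil Explosives Engineering Branch of Shanxi Coking Coal Group Company Limited (山西焦煤民爆集团矿山民爆工程分公司), Taiyuan 030300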
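  • Corresponding author: Shikun CHEN (陈世堃)
  • About the authors: HU Jing (胡静), born in 1977, female, from Taiyuan, Shanxi, professor, Ph. D., CCF senior member. Her main research interests are image processing and deep learning.
    WANG Fang (王芳), born in 1989, female, from Taiyuan, Shanxi, lecturer, M. S. Her main research interest is medical image segmentation.
    ZHANG Rui (张睿), born in 1987, male, from Taiyuan, Shanxi, associate professor, Ph. D., CCF senior member. His main research interest is intelligent information processing.
    WANG Yong (王勇), born in 1984, male, from Wuhan, Hubei, engineer. His main research interests are artificial intelligence and big data mining.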
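  • Supported by:
    Shanxi Provincial Natural Science Foundation (202203021211189); Shanxi Provincial Natural Science Foundation (202403021221142); Enterprise Commissioned Horizontal Project (2021035)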

Abstract:

To address the blurred boundaries and insufficient accuracy in ore image segmentation caused by complex texture, irregular shape, and uneven illumination, a segmentation network combining Linear Deformable Convolution (LDConv) with dual-domain synergistic dynamic attention, named LDDA-Net (Linear Deformable Dual-domain Attention Network), is proposed. LDDA-Net adopts an encoder-decoder architecture. First, in the serial dual-feature encoder, LDConv constructs an adaptive distribution of sampling points that flexibly fits the irregular shapes of ore, while its linear characteristics keep the computational overhead under control. Second, a Dynamic Attention Modulation (DAM) module is designed for spatial-domain features; it dynamically focuses on and reinforces key information in the feature map and the ore edges through pooling-based sampling, a learnable attention matrix, and a boundary-sensitive weight allocation mechanism. Finally, a new Dynamic Progressive Attention Guided Loss function (DPAG Loss) is proposed, which dynamically generates attention maps in multiple stages to guide the model, during training, toward hard-to-segment regions such as fuzzy boundaries and small ore particles. Together, DPAG Loss and the DAM module form a space-loss dual-domain synergy, creating a closed feedback loop between feature perception and learning strategy. Experimental results on the self-built open-pit ore dataset (OpenPitOre dataset) and the public ore dataset (Ore dataset) show that LDDA-Net achieves a 95% Hausdorff distance (HD95) boundary error of only 16.84 mm, 11.37% lower than that of the second-best model, VM-Unet; its Dice coefficient reaches 91.54%, and its mean Intersection over Union (mIoU) and Pixel Accuracy (PA) are 85.13% and 94.10%, respectively, significantly outperforming the compared segmentation models. LDDA-Net achieves high-precision, fine-grained segmentation in complex scenarios, providing reliable technical support for intelligent detection and fragmentation analysis of ore in open-pit blasting.

Key words: ore image segmentation, semantic segmentation, Linear Deformable Convolution (LDConv), attention mechanism, boundary detection, dual-domain synergy
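
The overlap metrics quoted in the abstract (Dice coefficient, mIoU, and PA) can be illustrated with a minimal NumPy sketch for a single pair of binary masks. This is not the authors' evaluation code: the function names are hypothetical, and averaging IoU over exactly two classes (ore and background) is an assumption, since the abstract does not state how mIoU is averaged. HD95 is omitted because it additionally requires boundary extraction and pixel spacing.

```python
import numpy as np


def confusion_counts(pred, target):
    """Return (tp, fp, fn, tn) for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    tp = int(np.sum(pred & target))    # ore predicted as ore
    fp = int(np.sum(pred & ~target))   # background predicted as ore
    fn = int(np.sum(~pred & target))   # ore predicted as background
    tn = int(np.sum(~pred & ~target))  # background predicted as background
    return tp, fp, fn, tn


def dice(pred, target):
    """Dice coefficient: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(pred, target)
    denom = 2 * tp + fp + fn
    return 1.0 if denom == 0 else 2 * tp / denom


def iou(pred, target):
    """Intersection over Union for the foreground class."""
    tp, fp, fn, _ = confusion_counts(pred, target)
    denom = tp + fp + fn
    return 1.0 if denom == 0 else tp / denom


def mean_iou(pred, target):
    """Mean IoU over two classes (foreground and background); an assumption."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    return (iou(pred, target) + iou(~pred, ~target)) / 2


def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted label matches the ground truth."""
    tp, fp, fn, tn = confusion_counts(pred, target)
    return (tp + tn) / (tp + fp + fn + tn)
```

For example, calling `dice(pred, target)` on a predicted mask and a ground-truth mask of the same shape returns a value in [0, 1]; dataset-level scores such as those in the abstract would typically be averaged over all test images.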

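Abstract:

To address the blurred boundaries and insufficient accuracy that complex texture, irregular shape, and uneven illumination cause in ore image segmentation, a segmentation network based on Linear Deformable Convolution (LDConv) and dual-domain synergistic dynamic attention, LDDA-Net (Linear Deformable Dual-domain Attention Network), is proposed. The network adopts an encoder-decoder architecture; in the serial dual-feature encoder, LDConv builds an adaptive distribution of sampling points that flexibly fits the irregular shapes of ore, and the linear characteristics of LDConv effectively control the computational overhead. Second, a Dynamic Attention Modulation (DAM) module is designed for spatial-domain features, which dynamically focuses on and reinforces the key information in the feature map and the ore edges through pooling sampling, a learnable attention matrix, and a boundary-sensitive weight allocation mechanism. Finally, a new Dynamic Progressive Attention Guided Loss function (DPAG Loss) is proposed, which dynamically generates attention maps in multiple stages to guide the model to focus, during training, on hard-to-segment regions such as fuzzy boundaries and small ore particles, and which forms a space-loss dual-domain synergy with the DAM module, building a closed feedback loop between feature perception and learning strategy. Experimental results on the self-built open-pit ore dataset (OpenPitOre Dataset) and the public ore dataset (Ore dataset) show that the 95% Hausdorff distance (HD95) boundary error of LDDA-Net is only 16.84 mm, 11.37% lower than that of the second-best model VM-Unet; the Dice coefficient reaches 91.54%, and the mean Intersection over Union (mIoU) and Pixel Accuracy (PA) are 85.13% and 94.10%, respectively, all significantly better than those of the compared segmentation models. LDDA-Net achieves high-precision, fine-grained segmentation in complex scenarios and can provide reliable technical support for the intelligent detection and fragmentation analysis of ore from open-pit blasting.

Key words: ore image segmentation, semantic segmentation, linear deformable convolution, attention mechanism, boundary detection, dual-domain synergy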

CLC Number: