Journal of Computer Applications (《计算机应用》): sole official website


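Ore image segmentation with linear deformable convolution and dual-domain synergistic dynamic attention

Hu Jing¹, Chen Shikun², Wang Fang¹, Zhang Rui¹, Wang Yong³

  1. Taiyuan University of Science and Technology
  2. School of Computer Science and Technology, Taiyuan University of Science and Technology
  3. Mining Civil Explosives Engineering Branch, Shanxi Coking Coal Civil Explosives Group
  • Received: 2025-06-12 Revised: 2025-08-10 Accepted: 2025-09-09 Online: 2025-09-25 Published: 2025-09-25
  • Corresponding author: Chen Shikun
  • Funding:
    Natural Science Foundation of Shanxi Province project; Natural Science Foundation of Shanxi Province project; enterprise-commissioned research project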

Abstract: Complex textures, irregular shapes, and uneven illumination in ore images lead to blurred boundaries and insufficient accuracy in segmentation tasks. To address these problems, a segmentation network combining linear deformable convolution with dual-domain synergistic dynamic attention, named LDDA-Net, is proposed. The network adopts an encoder-decoder architecture. First, in the serial dual feature encoder, Linear Deformable Convolution (LDConv) constructs an adaptive distribution of sampling points that flexibly fits the irregular shapes of ore, while its linear characteristics keep the computational overhead under control. Second, a Dynamic Attention Modulation module (DAM Module) is designed for spatial-domain features; through pooled sampling, a learnable attention matrix, and a boundary-sensitive weight allocation mechanism, it dynamically focuses on and strengthens key information and ore edges in the feature maps. Finally, a new Dynamic Progressive Attention-Guided loss function (DPAG Loss) is proposed. By dynamically generating attention maps over multiple stages, it guides the model during training to focus on hard-to-segment regions such as blurred boundaries and small ore particles. Together with the DAM module, it forms a spatial-loss dual-domain synergy, building a closed feedback loop between feature perception and learning strategy. Experimental results on the self-built open-pit ore dataset (OpenPitOre Dataset) and a public ore dataset (Ore dataset) show that LDDA-Net achieves an HD95 boundary error of only 16.84 mm, 11.37% lower than that of the second-best model, VM-Unet. Its Dice coefficient reaches 91.54%, and its mIoU and PA are 85.13% and 94.10%, respectively, all significantly better than those of the compared segmentation models. These results verify that LDDA-Net achieves high-precision, fine-grained segmentation in complex scenes and provides reliable technical support for the intelligent detection and fragment size analysis of ore from open-pit blasting.

Key words: ore image segmentation, semantic segmentation, linear deformable convolution, attention mechanism, boundary detection, dual-domain synergy
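
The abstract reports Dice, mIoU and PA without defining them. As a reference point, the following is a minimal sketch of the standard definitions for a binary mask (ore versus background). The function names, the two-class averaging used for mIoU, and the toy masks are assumptions made for illustration; they are not taken from the paper, whose evaluation code may differ. HD95 is left out because it additionally requires boundary extraction and a pixel-to-millimetre scale.

```python
from typing import Sequence, Tuple

Mask = Sequence[Sequence[int]]  # 2-D binary mask: 1 = ore, 0 = background


def confusion_counts(pred: Mask, target: Mask) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) counted over all pixels of two equal-size masks."""
    tp = fp = fn = tn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def dice(pred: Mask, target: Mask) -> float:
    """Dice coefficient of the foreground class: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(pred, target)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


def mean_iou(pred: Mask, target: Mask) -> float:
    """IoU averaged over the two classes (foreground and background)."""
    tp, fp, fn, tn = confusion_counts(pred, target)
    fg_denom = tp + fp + fn
    bg_denom = tn + fp + fn
    fg_iou = tp / fg_denom if fg_denom else 1.0
    bg_iou = tn / bg_denom if bg_denom else 1.0
    return (fg_iou + bg_iou) / 2


def pixel_accuracy(pred: Mask, target: Mask) -> float:
    """Fraction of pixels whose predicted label matches the ground truth."""
    tp, fp, fn, tn = confusion_counts(pred, target)
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 1.0


# Toy 4x4 example (hypothetical data, not from either dataset).
toy_pred = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1]]
toy_target = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

if __name__ == "__main__":
    print(f"Dice = {dice(toy_pred, toy_target):.4f}")
    print(f"mIoU = {mean_iou(toy_pred, toy_target):.4f}")
    print(f"PA   = {pixel_accuracy(toy_pred, toy_target):.4f}")
```

In the toy example the prediction misses one ore pixel and adds one spurious pixel, so the three scores differ: Dice and IoU penalize those two errors relative to the small foreground, while PA is dominated by the correctly labelled background.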
