Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (1): 261-274. DOI: 10.11772/j.issn.1001-9081.2023121776

• Multimedia Computing and Computer Simulation •


Interpretability study on deformable convolutional network and its application in butterfly species recognition models

Lu WANG1, Dong LIU1, Weiguang LIU2

  1. School of Computer Science, Zhongyuan University of Technology, Zhengzhou, Henan 450007, China
    2. Software College, Zhongyuan University of Technology, Zhengzhou, Henan 450007, China
  • Received: 2023-12-27 Revised: 2024-05-07 Accepted: 2024-05-08 Online: 2024-05-21 Published: 2025-01-10
  • Contact: Dong LIU
  • About author: WANG Lu, born in 1972 in Fushun, Liaoning, Ph.D., associate professor. His research interests include machine vision, artificial intelligence, and parallel algorithms.
    LIU Weiguang, born in 1966 in Xinxiang, Henan, Ph.D., professor. His research interests include computer vision, artificial intelligence, and information security.


Abstract:

In recent years, Deformable Convolutional Network (DCN) has been widely applied in fields such as image recognition and classification. However, research on the interpretability of this model is relatively limited, and its applicability lacks sufficient theoretical support. To address these issues, this paper presented an interpretability study of DCN and its application in butterfly species recognition models. Firstly, deformable convolution was introduced to improve the VGG16, ResNet50, and DenseNet121 (Dense Convolutional Network 121) classification models. Secondly, visualization methods such as deconvolution and Class Activation Mapping (CAM) were used to compare the feature extraction capabilities of deformable convolution and standard convolution, and the ablation results show that deformable convolution performs better when placed in the lower layers of the network and not in consecutive layers. Thirdly, Saliency Removal (SR) was proposed to evaluate CAM performance and the importance of activation features in a unified way, with the objectivity of the evaluation improved from multiple perspectives such as setting different removal thresholds. Finally, FullGrad (Full Gradient-weighted), which scored higher in this evaluation, was used to explain the evidence behind the model's recognition decisions. Experimental results show that on the Archive_80 dataset, the accuracy of the proposed D_v2-DenseNet121 reaches 97.03%, which is 2.82 percentage points higher than that of the DenseNet121 classification model. It can be seen that the introduction of deformable convolution endows the neural network model with the ability to extract invariant features and improves the accuracy of the classification model.
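To make the first step more concrete, the sketch below shows one way a standard 3×3 convolution inside a DenseNet121 backbone could be replaced by a DCNv2-style deformable convolution whose offsets and modulation mask are predicted from the input feature map. The paper does not give implementation details, so this is only a minimal sketch assuming PyTorch and torchvision (DeformConv2d, densenet121); the particular layer chosen for replacement is hypothetical, guided by the ablation finding that lower, non-consecutive layers benefit most.

```python
import torch
import torch.nn as nn
from torchvision import models
from torchvision.ops import DeformConv2d


class DeformableConvBlock(nn.Module):
    """DCNv2-style deformable convolution: offsets and a modulation mask
    are predicted from the input feature map and passed to DeformConv2d."""

    def __init__(self, in_ch, out_ch, kernel_size=3, stride=1, padding=1):
        super().__init__()
        k = kernel_size
        # 2 offset values (dy, dx) per sampling point, 1 mask value per point.
        self.offset_conv = nn.Conv2d(in_ch, 2 * k * k, k, stride, padding)
        self.mask_conv = nn.Conv2d(in_ch, k * k, k, stride, padding)
        self.deform_conv = DeformConv2d(in_ch, out_ch, k, stride, padding)
        # Zero-initialized offsets make the block start out like a standard conv.
        nn.init.zeros_(self.offset_conv.weight)
        nn.init.zeros_(self.offset_conv.bias)

    def forward(self, x):
        offset = self.offset_conv(x)
        mask = torch.sigmoid(self.mask_conv(x))  # modulation weights in [0, 1]
        return self.deform_conv(x, offset, mask)


# Hypothetical layer choice: the 3x3 conv of the first dense layer in block 1,
# i.e. a low, non-consecutive position, in line with the ablation finding.
model = models.densenet121(weights=None)
old = model.features.denseblock1.denselayer1.conv2
model.features.denseblock1.denselayer1.conv2 = DeformableConvBlock(
    old.in_channels, old.out_channels, kernel_size=3, stride=1, padding=1)
```

The abstract describes SR only at a high level. The helper below is a simplified, hypothetical version of the idea of masking the most salient regions at several thresholds and measuring how much the original prediction degrades; it is not the authors' exact SR metric, and `saliency` is assumed to be a CAM-style map already resized to the input resolution.

```python
def saliency_removal_drop(model, image, saliency, thresholds=(0.3, 0.5, 0.7)):
    """Mask out the most salient regions at several thresholds and record the
    drop in confidence for the originally predicted class.
    image: (1, 3, H, W); saliency: CAM-style map of shape (1, 1, H, W)."""
    model.eval()
    drops = {}
    with torch.no_grad():
        base = torch.softmax(model(image), dim=1)
        cls = base.argmax(dim=1)
        base_conf = base[0, cls]
        # Normalize the saliency map to [0, 1] before thresholding.
        s = (saliency - saliency.min()) / (saliency.max() - saliency.min() + 1e-8)
        for t in thresholds:
            keep = (s < t).float()  # 1 where saliency is below the threshold
            conf = torch.softmax(model(image * keep), dim=1)[0, cls]
            drops[t] = (base_conf - conf).item()
    return drops
```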

Key words: Deformable Convolutional Network (DCN), interpretability, butterfly species recognition, Class Activation Mapping (CAM), Saliency Removal (SR)
