Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (10): 3328-3335. DOI: 10.11772/j.issn.1001-9081.2024091324

• Multimedia computing and computer simulation •

Camouflaged object detection by boundary mining and background guidance

Zhonghua LI (1, 2), Gengxin ZHONG (1, 2), Ping FAN (1, 2), Hengliang ZHU (1, 2)

  1. Fujian Provincial Key Laboratory of Big Data Mining and Applications (Fujian University of Technology), Fuzhou, Fujian 350018, China
  2. School of Computer Science and Mathematics, Fujian University of Technology, Fuzhou, Fujian 350018, China
  • Received: 2024-09-20 Revised: 2024-11-18 Accepted: 2024-11-22 Online: 2025-01-13 Published: 2025-10-10
  • Contact: Hengliang ZHU
  • About author: LI Zhonghua, born in 1976 in Nanping, Fujian, Ph. D., associate professor, CCF member. His research interests include image processing and artificial intelligence.
    ZHONG Gengxin, born in 1999 in Ganzhou, Jiangxi, M. S. candidate. His research interests include camouflaged object detection.
    FAN Ping, born in 1996 in Longyan, Fujian, M. S. candidate. His research interests include facial expression recognition.
    ZHU Hengliang, born in 1982 in Fuzhou, Fujian, Ph. D., lecturer, CCF member. His research interests include object detection and deep learning. Email: hengliang_zhu@fjut.edu.cn
  • Supported by:
    Natural Science Foundation of Fujian Province (2023J01348); Natural Science Foundation of Fujian Province (2022J01954); Science and Technology Project of Fujian University of Technology (GY-Z220205)

Abstract:

Camouflaged objects are highly similar to their backgrounds and are easily confused with background features, which makes boundary information hard to distinguish and object features hard to extract. Current mainstream Camouflaged Object Detection (COD) algorithms focus mainly on the camouflaged object itself and its boundaries, ignoring the relationship between the image background and the object, so their detection results are unsatisfactory in complex scenes. To explore the potential connection between background and object, a COD algorithm that mines boundaries and background was proposed, called I2DNet (Indirect to Direct Network). The algorithm consists of five parts: the encoder, which processes the raw input data; the Boundary-guided feature Extracting and Mining Framework (BEMF), which extracts finer boundary features through feature processing and feature mining; the Latent-feature Exploring Framework based on Background guidance (LEFB), which explores more salient features through multi-scale convolution and includes an attention-based Hybrid Attention Module (HAM) to strengthen the selection of background features; the Information Supplement Module (ISM), which compensates for detail lost during feature processing; and the Multi-task Co-segmentation Decoder (MCD), which efficiently fuses the features extracted by the different tasks and modules and outputs the final prediction. Experimental results on three widely used datasets show that the proposed algorithm outperforms 15 other state-of-the-art models; in particular, its mean absolute error on the CAMO dataset drops to 0.042.

Key words: Camouflaged Object Detection (COD), reverse guidance, multi-scale convolution, attention mechanism, feature aggregation
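
For readers unfamiliar with the mean absolute error metric cited in the abstract, it is commonly defined as the average per-pixel absolute difference between a predicted map and the ground-truth mask, both scaled to [0, 1]. The following is a minimal sketch under that assumption; the function name `mean_absolute_error` and the toy values are hypothetical and are not taken from the paper or the authors' evaluation code.

```python
from typing import Sequence


def mean_absolute_error(pred: Sequence[Sequence[float]],
                        mask: Sequence[Sequence[float]]) -> float:
    """Average per-pixel absolute difference between two same-sized maps.

    Both maps are assumed to hold values in [0, 1] (an assumption of this
    sketch, not something the function enforces).
    """
    if len(pred) != len(mask) or any(len(p) != len(m) for p, m in zip(pred, mask)):
        raise ValueError("prediction and mask must have the same shape")
    total = 0.0
    count = 0
    for pred_row, mask_row in zip(pred, mask):
        for p, m in zip(pred_row, mask_row):
            total += abs(p - m)
            count += 1
    if count == 0:
        raise ValueError("maps must contain at least one pixel")
    return total / count


# Toy 2x2 example with made-up values (not data from the paper).
prediction = [[0.9, 0.2], [0.1, 0.8]]
ground_truth = [[1.0, 0.0], [0.0, 1.0]]
score = mean_absolute_error(prediction, ground_truth)
```

Lower values indicate a closer match to the mask. Benchmark figures such as the one quoted for CAMO are typically this per-image value averaged over the whole test set.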

CLC Number: