Journal of Computer Applications, 2021, Vol. 41, Issue (7): 2076-2081. DOI: 10.11772/j.issn.1001-9081.2020081308

Special topic: Multimedia computing and computer simulation


Shadow detection method based on hybrid attention model

TAN Daoqiang (谭道强)1, ZENG Cheng (曾诚)1,2,3, QIAO Jinxia (乔金霞)1, ZHANG Jun (张俊)1

  1. School of Computer Science and Information Engineering, Hubei University, Wuhan, Hubei 430062, China;
  2. Hubei Provincial Engineering and Technology Research Center for Software Engineering (Hubei University), Wuhan, Hubei 430062, China;
  3. Hubei Provincial Engineering Research Center for Intelligent Government Affairs and Artificial Intelligence Application (Hubei University), Wuhan, Hubei 430062, China
  • Received: 2020-08-27 Revised: 2020-12-24 Published: 2021-07-10 Online: 2021-01-11
  • Corresponding author: ZENG Cheng
  • About the authors: TAN Daoqiang, born in 1995 in Xianning, Hubei, male, M.S. candidate, CCF member. His research interests include image processing and shadow detection. ZENG Cheng, born in 1976 in Wuhan, Hubei, male, Ph.D., professor, CCF member. His research interests include artificial intelligence and industry software. QIAO Jinxia, born in 1997 in Gaoping, Shanxi, female, M.S. candidate. Her research interests include knowledge graphs and recommender systems. ZHANG Jun, born in 1998 in Jingzhou, Hubei, male, M.S. candidate. His research interests include artificial intelligence and information hiding.
  • Supported by:
    This work is partially supported by the General Program of the National Natural Science Foundation of China (61977021), the Young Scientists Fund of the National Natural Science Foundation of China (61902114), and the 2019 Technology Innovation Special Project of Hubei Province (2019ACA144).

Abstract: Shadow regions in an image introduce uncertainty about the image content and hinder other computer vision tasks, so shadow detection is often used as a preprocessing step for computer vision algorithms. However, most existing shadow detection algorithms use a multi-stage network structure, which makes model training difficult. Although some single-stage algorithms have been proposed, they attend only to local shadows and ignore the relations between shadows. To address this problem and improve the accuracy and robustness of shadow detection, a shadow detection method based on a hybrid attention model was proposed. Firstly, the pre-trained deep network ResNeXt101 was used as the front-end feature extraction network to extract the basic features of the image. Secondly, a bidirectional pyramid structure was used to fuse features in both the shallow-to-deep and deep-to-shallow directions, and an information compensation mechanism was proposed to reduce the loss of deep semantic information. Thirdly, a hybrid attention model combining spatial attention and channel attention was proposed for feature fusion, so as to capture the differences between shadow and non-shadow regions. Finally, the prediction results of the two directions were merged to obtain the final shadow detection result. Comparison experiments were conducted on the public datasets SBU and UCF. The results show that, compared with the DSC (Direction-aware Spatial Context) algorithm, the Balance Error Rate (BER) of the proposed method is reduced by 30% and 11% respectively, indicating that the proposed method can better suppress false shadow detections and enhance shadow details.
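
The abstract reports results as a Balance Error Rate. As a minimal sketch of how this metric is commonly computed in the shadow detection literature, assuming binary per-pixel masks where 1 marks shadow, BER averages the error on shadow pixels and the error on non-shadow pixels, so that a large non-shadow background does not dominate the score. The function name `balance_error_rate` and the flat-list input format are illustrative assumptions, not taken from the paper.

```python
def balance_error_rate(pred, gt):
    """Balance Error Rate in percent for binary shadow masks.

    pred, gt: flat sequences of 0/1 values (1 = shadow), one entry per pixel.
    BER = (1 - 0.5 * (TP / N_shadow + TN / N_non_shadow)) * 100
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same number of pixels")

    tp = tn = n_pos = n_neg = 0
    for p, g in zip(pred, gt):
        if g:                      # ground-truth shadow pixel
            n_pos += 1
            tp += 1 if p else 0    # correctly detected shadow
        else:                      # ground-truth non-shadow pixel
            n_neg += 1
            tn += 0 if p else 1    # correctly rejected non-shadow

    if n_pos == 0 or n_neg == 0:
        raise ValueError("BER needs both shadow and non-shadow pixels in gt")

    return (1.0 - 0.5 * (tp / n_pos + tn / n_neg)) * 100.0


# Toy 2x4 mask, flattened row by row: one missed shadow pixel, one false alarm.
gt = [1, 1, 0, 0, 1, 1, 0, 0]
pred = [1, 0, 0, 0, 1, 1, 1, 0]
print(balance_error_rate(pred, gt))  # 25.0
```

Lower is better: a perfect prediction gives 0, and in the toy example both classes are 75% correct, which gives 25.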

Key words: shadow detection, Convolutional Neural Network (CNN), spatial attention, channel attention, information compensation mechanism, bidirectional pyramid structure
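
The abstract does not describe the internals of the hybrid attention model. As a rough, parameter-free illustration of how channel attention and spatial attention can jointly gate a feature map, a NumPy sketch follows. The function names, the average-pooling descriptors, and the multiplicative combination are all assumptions made for illustration and are not the authors' design; a trained network would use learned layers in place of the plain averages.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic function, mapping scores into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat):
    """One weight per channel from global average pooling (assumed form).

    feat: array of shape (C, H, W). Returns shape (C, 1, 1).
    """
    return sigmoid(feat.mean(axis=(1, 2), keepdims=True))


def spatial_attention(feat):
    """One weight per pixel from the mean over channels (assumed form).

    feat: array of shape (C, H, W). Returns shape (1, H, W).
    """
    return sigmoid(feat.mean(axis=0, keepdims=True))


def hybrid_attention(feat):
    """Gate the feature map by both weight maps via broadcasting."""
    return feat * channel_attention(feat) * spatial_attention(feat)


# Hypothetical 2-channel, 2x2 feature map filled with ones.
feat = np.ones((2, 2, 2))
out = hybrid_attention(feat)
print(out.shape)  # (2, 2, 2)
```

The channel weights say which feature channels matter, the spatial weights say where in the image to look, and multiplying both into the features keeps the output the same shape as the input so it can feed a later fusion step.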

CLC number: