Salient object detection based on difference of Gaussian feature network

doi:10.11772/j.issn.1001-9081.2020060957

Abstract

Abstract: Center-surround contrast theory, a cue with a physiological basis, has been widely used in traditional saliency detection models, but it is rarely applied explicitly in models based on deep Convolutional Neural Networks (CNNs). To introduce this classic theory into deep CNNs, a salient object detection model based on a Difference of Gaussian (DoG) feature network was proposed. Firstly, a Difference of Gaussian Pyramid (DGP) structure was constructed on deep features at multiple scales to perceive the locally prominent features of salient objects in an image. Then, the resulting difference features were used to perform weighted selection on the deep features rich in semantic information. Finally, accurate extraction of the salient object was achieved. In addition, the Gaussian smoothing in the proposed network was implemented with standard one-dimensional convolutions, which reduces computational complexity while allowing end-to-end training of the network. The proposed model was compared with six salient object detection algorithms on four public datasets, and it achieved the best performance in the quantitative evaluation of both Mean Absolute Error (MAE) and maximum F-measure. In particular, on the DUTS-TE dataset, the maximum F-measure and MAE of the proposed model reached 0.885 and 0.039 respectively. Experimental results show that the proposed model has good detection performance for salient objects in complex natural scenes.
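The center-surround idea summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' network: the function names, the sigma values, the 3-sigma kernel radius, the edge padding and the sigmoid gate are all assumptions made for illustration. It shows only that a 2D Gaussian smoothing can be carried out as two one-dimensional convolutions (one along rows, one along columns), that subtracting a wide smoothing from a narrow one gives a DoG response, and that such a response could be used to weight another feature map.

```python
import numpy as np


def gaussian_kernel_1d(sigma, radius=None):
    """Normalized 1D Gaussian kernel (the 3*sigma radius is an assumed default)."""
    if radius is None:
        radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()


def smooth_separable(feat, sigma):
    """Gaussian-smooth a 2D map using a row pass and a column pass of 1D convolution."""
    k = gaussian_kernel_1d(sigma)
    r = len(k) // 2
    # Edge padding keeps the output the same size as the input (an assumption).
    padded = np.pad(feat, r, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)


def dog_response(feat, sigma_center=1.0, sigma_surround=2.0):
    """Center-surround contrast: narrow (center) smoothing minus wide (surround) smoothing."""
    return smooth_separable(feat, sigma_center) - smooth_separable(feat, sigma_surround)


def weight_features(semantic, dog):
    """Hypothetical weighted selection: gate a semantic map with a sigmoid of the DoG response."""
    gate = 1.0 / (1.0 + np.exp(-dog))
    return semantic * gate


if __name__ == "__main__":
    feat = np.zeros((17, 17))
    feat[8, 8] = 1.0  # a single locally prominent activation
    dog = dog_response(feat)
    print("DoG at the peak:", dog[8, 8])
    print("DoG far from the peak:", dog[0, 0])
```

Because the Gaussian is separable, the two 1D passes cost on the order of 2k multiplications per pixel for a kernel of length k, compared with k squared for a direct 2D convolution, which is the kind of saving the abstract refers to.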

Key words: salient object detection, Difference of Gaussian Pyramid (DGP), Center-Surround Contrast (CSC), feature fusion, Convolutional Neural Network (CNN)

CLC Number: TP391.4

HOU Yunlong, ZHU Lei, CHEN Qin, LYU Suidong. Salient object detection based on difference of Gaussian feature network[J]. Journal of Computer Applications, 2021, 41(3): 706-713.
