Weakly supervised semantic segmentation algorithm based on self-supervised image pairs

doi:10.11772/j.issn.1001-9081.2022020304

Journal of Computer Applications (official website)

1. Chengdu Institute of Computer Applications, Chinese Academy of Sciences
2. School of Computer Science and Technology, University of Chinese Academy of Sciences
3. International Institute for Artificial Intelligence, Harbin Institute of Technology (Shenzhen)
4. Chongqing Research Institute, Harbin Institute of Technology

Received: 2022-03-14 Revised: 2022-06-28 Accepted: 2022-06-30 Online: 2022-09-02 Published: 2022-09-02
Corresponding author: CHEN Bin
About the authors: HOU Xiaozhen (born 1995), male, from Linyi, Shandong, M.S. candidate; research interests: weakly supervised semantic segmentation and video content understanding. CHEN Bin (born 1970), male, from Guanghan, Sichuan, research fellow, Ph.D., CCF member; research interests: industrial inspection and deep learning.

Abstract

Abstract: To reduce the annotation cost of semantic segmentation, a new weakly supervised semantic segmentation algorithm based on self-supervised image pairs, Co-Net, was proposed. Firstly, a pair of images was fed separately into the backbone network to extract the features of the image pair. Secondly, the features were flattened, combined with positional information, and sent to the encoding layer. Then, the encoded features were fed into the Collaborative Attention Module (CoAM) and the Self-Attention Module (BiAM) so that the two images could represent each other's information. Finally, two self-supervised tasks, image Region Masking (MRM) and Image Pair Matching (IPM), were used to train the network to learn the global and local associations within the image pair, yielding more accurate initial seeds. Using only image-level labels for weakly supervised semantic segmentation, the algorithm achieved mIoU of 69.8% and 70.3% on the Pascal VOC 2012 validation and test sets, respectively. Compared with Group-Wise Semantic Mining for Weakly Supervised Semantic Segmentation (GroupWSSS), an algorithm that also takes image pairs as input, the mIoU was improved by 1.6 and 1.8 percentage points on the validation set and test set, respectively. The experimental results show that the proposed algorithm can obtain more complete object activation regions.

Key words: semantic segmentation, weakly supervised learning, self-supervised learning, weakly supervised semantic segmentation, deep learning
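
The abstract reports its results as mIoU (mean intersection over union). As a reading aid, the following is a minimal sketch of how that metric is commonly computed from a confusion matrix. It is not the authors' evaluation code: the function names `confusion_matrix` and `mean_iou` and the parameter `ignore_index` are hypothetical, and the value 255 for unlabeled pixels follows the usual Pascal VOC convention.

```python
import numpy as np


def confusion_matrix(pred, target, num_classes, ignore_index=255):
    """Count (ground truth, prediction) pairs; rows index the ground-truth class.

    Assumption: pixels labeled `ignore_index` in `target` are excluded,
    and every remaining label in `pred` and `target` is < num_classes.
    """
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    keep = target != ignore_index
    # Encode each (target, pred) pair as a single integer, then histogram.
    idx = target[keep].astype(np.int64) * num_classes + pred[keep].astype(np.int64)
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def mean_iou(conf):
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    tp = np.diag(conf).astype(np.float64)
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    valid = union > 0  # skip classes absent from both prediction and ground truth
    return float((tp[valid] / union[valid]).mean())


# Toy 2x2 label maps with two classes; the last ground-truth pixel is ignored.
pred = [[0, 1], [1, 1]]
target = [[0, 1], [0, 255]]
conf = confusion_matrix(pred, target, num_classes=2)
print(mean_iou(conf))  # 0.5
```

For a whole dataset, the confusion matrices of all images would be summed before calling `mean_iou`, so that each class is scored over every pixel in the set rather than image by image.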

CLC number: TP183

HOU Xiaozhen, CHEN Bin. Weakly supervised semantic segmentation algorithm based on self-supervised image pairs[J]. Journal of Computer Applications, DOI: 10.11772/j.issn.1001-9081.2022020304.

