Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (3): 844-853. DOI: 10.11772/j.issn.1001-9081.2021030392
• Artificial Intelligence •
Received: 2021-03-16
Revised: 2021-05-16
Accepted: 2021-05-31
Online: 2022-04-09
Published: 2022-03-10
Contact: Yuan WAN
About the author: YU Na, born in 2000, is a native of Xianning, Hubei, and a CCF member. Her research interests include machine learning, deep learning, and semantic image segmentation.
Na YU, Yan LIU, Xiongju WEI, Yuan WAN
Abstract:
To address the problem that existing RGB-D indoor scene semantic segmentation methods cannot effectively fuse multi-modal features, a semantic segmentation network model for RGB-D indoor scene images based on an attention mechanism and pyramid fusion, named APFNet, was proposed, and two new modules were designed for it: an attention mechanism fusion module and a pyramid fusion module. The attention mechanism fusion module extracts attention allocation weights for the RGB features and the Depth features separately, making full use of the complementarity of the two kinds of features so that the network focuses on the multi-modal feature domain with higher information content. The pyramid fusion module uses features at four different pyramid scales to fuse local and global information, extract scene context, and improve the segmentation accuracy of object edges and small-scale objects. The two fusion modules were integrated into a three-branch encoder-decoder network to achieve end-to-end output. On the SUN RGB-D and NYU Depth v2 datasets, the model was compared experimentally with state-of-the-art methods such as the multi-level residual feature fusion network (RDF-152), the attention complementary network (ACNet), and the spatial information guided convolution network (SGNet). Experimental results show that, compared with the best-performing method RDF-152, APFNet improves the Pixel Accuracy (PA), Mean Pixel Accuracy (MPA), and Mean Intersection over Union (MIoU) by 0.4, 1.1, and 3.2 percentage points respectively, while reducing the number of encoder layers from 152 to 50. APFNet also improves the semantic segmentation quality of small-scale objects such as pillows and photos by 0.9 to 4.5 percentage points, and of large-scale objects such as boards and ceilings by 12.4 to 18 percentage points. The model therefore has clear advantages for indoor scene semantic segmentation.
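The attention-weighted fusion of the two modalities described above can be illustrated in a few lines. The following is a minimal NumPy sketch, not the paper's implementation: the name `attention_fuse` and the sigmoid-over-pooled-channels weighting are assumptions standing in for the attention mechanism fusion module.

```python
import numpy as np

def attention_fuse(rgb_feat, depth_feat):
    """Fuse RGB and Depth feature maps of shape (C, H, W) with
    per-channel attention weights (illustrative sketch)."""
    def channel_weights(feat):
        # Global average pooling gives one descriptor per channel;
        # a sigmoid squashes it into (0, 1) as an attention weight.
        pooled = feat.mean(axis=(1, 2))        # (C,)
        return 1.0 / (1.0 + np.exp(-pooled))   # (C,)

    w_rgb = channel_weights(rgb_feat)[:, None, None]
    w_d = channel_weights(depth_feat)[:, None, None]
    # Re-weight each modality by its own attention, then sum, so the
    # more informative channels of either modality dominate the fusion.
    return w_rgb * rgb_feat + w_d * depth_feat

rgb = np.random.rand(64, 30, 40)
depth = np.random.rand(64, 30, 40)
fused = attention_fuse(rgb, depth)
print(fused.shape)  # (64, 30, 40)
```

In the actual model the weights would be learned (e.g. through convolutions before the sigmoid); the sketch only shows the data flow of attention-gated multi-modal fusion.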
Na YU, Yan LIU, Xiongju WEI, Yuan WAN. Semantic segmentation of RGB-D indoor scenes based on attention mechanism and pyramid fusion[J]. Journal of Computer Applications, 2022, 42(3): 844-853.
Fig. 1 Overall architecture of the APFNet RGB-D semantic segmentation model based on attention mechanism and pyramid fusion
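The pyramid fusion idea (pooling the feature map at several grid scales and mixing local detail with global context) can likewise be sketched. This is a hypothetical NumPy illustration; `pyramid_fuse` and the bin sizes (1, 2, 3, 6) are illustrative choices, not the paper's exact configuration.

```python
import numpy as np

def pyramid_fuse(feat, bins=(1, 2, 3, 6)):
    """Pool a (C, H, W) feature map onto several b x b grids, upsample
    each grid back to (H, W), and concatenate everything channel-wise."""
    C, H, W = feat.shape
    levels = [feat]
    for b in bins:
        # Average-pool onto a b x b grid: each cell summarises one region.
        pooled = np.zeros((C, b, b))
        for i in range(b):
            for j in range(b):
                hs, he = i * H // b, (i + 1) * H // b
                ws, we = j * W // b, (j + 1) * W // b
                pooled[:, i, j] = feat[:, hs:he, ws:we].mean(axis=(1, 2))
        # Nearest-neighbour upsample back to the input resolution.
        rows = np.arange(H) * b // H
        cols = np.arange(W) * b // W
        levels.append(pooled[:, rows[:, None], cols[None, :]])
    return np.concatenate(levels, axis=0)

feat = np.random.rand(32, 24, 24)
out = pyramid_fuse(feat)
print(out.shape)  # (160, 24, 24): original 32 channels + 4 pooled levels
```

The coarsest level (1x1 bin) carries purely global context, while the finer bins preserve spatial layout, which is what helps with object edges and small objects.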
| Algorithm | PA | MPA | MIoU | Algorithm | PA | MPA | MIoU |
|---|---|---|---|---|---|---|---|
| Fuse-SP5 | 76.3 | 48.3 | 37.3 | RedNet | 81.3 | 60.3 | 47.8 |
| DFCN-DCRF | 76.6 | 50.6 | 39.3 | ACNet | — | — | 48.1 |
| RDF-152 | 81.5 | 60.1 | 47.7 | DSPPNet | 72.9 | 42.0 | 32.5 |
| 3DGNN | — | 57.0 | 45.9 | SGNet | 81.0 | 59.8 | 47.5 |
| CFN-152 | — | — | 48.1 | APFNet | 81.9 | 61.2 | 50.9 |
| DCNN | — | 53.5 | 42.0 | | | | |

Tab. 1 PA, MPA and MIoU comparison of different algorithms on SUN RGB-D (%)
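The three metrics reported in these tables are all derived from a class confusion matrix: PA is overall pixel accuracy, MPA averages per-class accuracy, and MIoU averages per-class intersection over union. A minimal NumPy sketch (illustrative; `seg_metrics` is not the paper's code):

```python
import numpy as np

def seg_metrics(conf):
    """PA, MPA and MIoU from a (K, K) confusion matrix where
    conf[i, j] counts pixels of true class i predicted as class j."""
    tp = np.diag(conf).astype(float)       # correctly labelled pixels
    true_pixels = conf.sum(axis=1)         # pixels of each true class
    pred_pixels = conf.sum(axis=0)         # pixels assigned to each class
    pa = tp.sum() / conf.sum()                         # pixel accuracy
    mpa = np.mean(tp / true_pixels)                    # mean per-class accuracy
    iou = tp / (true_pixels + pred_pixels - tp)        # per-class IoU
    return pa, mpa, iou.mean()

# Toy 2-class example: 50 + 35 pixels correct out of 100.
conf = np.array([[50, 5], [10, 35]])
pa, mpa, miou = seg_metrics(conf)
print(round(pa, 2))  # 0.85
```

For the 37- and 40-class benchmarks used here the same formulas apply with a larger K; classes absent from the ground truth would need a division guard, omitted in this sketch.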
| Algorithm | PA | MPA | MIoU | Algorithm | PA | MPA | MIoU |
|---|---|---|---|---|---|---|---|
| LSD-GF | 71.9 | 60.7 | 45.9 | ACNet | — | — | 48.3 |
| RDF-50 | 74.8 | 60.4 | 47.7 | CTNet | 76.3 | — | 50.6 |
| RDF-152 | 76.0 | 62.8 | 50.1 | TSNet | 73.5 | 59.6 | 46.1 |
| MSNet | 72.9 | 57.9 | — | SGNet | 76.1 | 62.7 | 50.2 |
| SCN-152 | — | — | 49.6 | APFNet | 76.9 | 63.2 | 52.3 |

Tab. 2 PA, MPA and MIoU comparison of different algorithms on NYU Depth v2 (%)
| Algorithm | wall | floor | cabinet | bed | chair | sofa | table | door | window | bkshelf |
|---|---|---|---|---|---|---|---|---|---|---|
| RDF-101 | 78.8 | 87.3 | 63.0 | 71.6 | 65.1 | 62.8 | 49.7 | 39.5 | 48.5 | 46.5 |
| RDF-152 | 79.7 | 87.0 | 60.9 | 73.4 | 64.6 | 65.4 | 50.7 | 39.9 | 49.6 | 44.9 |
| APFNet | 80.0 | 87.4 | 58.6 | 74.6 | 66.4 | 61.3 | 49.6 | 43.2 | 50.2 | 47.3 |

| Algorithm | picture | counter | blind | desk | shelf | curtain | dresser | pillow | mirror | mat |
|---|---|---|---|---|---|---|---|---|---|---|
| RDF-101 | 60.8 | 65.5 | 61.5 | 30.8 | 12.4 | 54.0 | 54.0 | 46.6 | 55.5 | 41.6 |
| RDF-152 | 61.2 | 67.1 | 63.9 | 28.6 | 14.2 | 59.7 | 49.0 | 49.9 | 54.3 | 39.4 |
| APFNet | 62.1 | 67.2 | 60.1 | 28.8 | 15.7 | 55.6 | 49.2 | 50.3 | 46.5 | 42.7 |

| Algorithm | cloths | ceiling | books | refridg | tv | paper | towel | shower | box | board |
|---|---|---|---|---|---|---|---|---|---|---|
| RDF-101 | 26.3 | 69.7 | 36.0 | 55.7 | 63.2 | 34.6 | 39.1 | 38.5 | 13.1 | 46.0 |
| RDF-152 | 26.9 | 69.1 | 35.0 | 58.9 | 63.8 | 34.1 | 41.6 | 38.5 | 11.6 | 54.0 |
| APFNet | 27.4 | 75.9 | 33.7 | 54.2 | 58.3 | 36.2 | 42.9 | 35.6 | 14.6 | 72.0 |

| Algorithm | person | stand | toilet | sink | lamp | bathtub | bag | othstr | othfurn | othprop |
|---|---|---|---|---|---|---|---|---|---|---|
| RDF-101 | 81.8 | 42.5 | 68.9 | 56.1 | 45.8 | 49.0 | 13.4 | 31.0 | 19.5 | 38.6 |
| RDF-152 | 80.0 | 45.3 | 65.7 | 62.1 | 47.1 | 57.3 | 19.1 | 30.7 | 20.6 | 39.0 |
| APFNet | 81.9 | 47.1 | 78.1 | 63.0 | 50.5 | 61.8 | 14.9 | 32.3 | 19.3 | 39.1 |

Tab. 3 Comparison of IoU of 40 classes on NYU Depth v2 dataset (%)
| Algorithm | AMFM | PFM | PA | MPA | MIoU |
|---|---|---|---|---|---|
| concat | × | × | 73.7 | 57.5 | 46.90 |
| APFNet | × | √ | 75.4 | 61.4 | 49.50 |
| APFNet | √ | × | 75.1 | 60.9 | 50.30 |
| APFNet | √ | √ | 76.9 | 63.2 | 52.26 |

Tab. 4 Impact of the two fusion modules on PA, MPA and MIoU (%)
| Algorithm | AMFM | PFM | Memory size/MB | Runtime/ms |
|---|---|---|---|---|
| concat | × | × | 903 | 508 |
| APFNet | × | √ | 972 | 518 |
| APFNet | √ | × | 938 | 535 |
| APFNet | √ | √ | 1 016 | 543 |
| RDFNet-50 | | | 989 | 524 |
| TSNet | | | 963 | 629 |

Tab. 5 Comparison of memory footprint and runtime of different algorithms
[1] TIAN X, WANG L, DING Q. Review of image semantic segmentation based on deep learning[J]. Journal of Software, 2019, 30(2): 440-468.
[2] XU H, ZHU Y H, ZHEN T, et al. Survey of image semantic segmentation methods based on deep neural network[J]. Journal of Frontiers of Computer Science and Technology, 2021, 15(1): 47-59.
[3] LANG Y, ZHENG D. An improved Sobel edge detection operator[C]// Proceedings of the 2016 6th International Conference on Mechatronics, Computer and Education Informationization. Paris: Atlantis Press, 2016: 590-593. DOI: 10.2991/mcei-16.2016.123.
[4] PHAM D L, XU C, PRINCE J L. Current methods in medical image segmentation[J]. Annual Review of Biomedical Engineering, 2000, 2(1): 315-337. DOI: 10.1146/annurev.bioeng.2.1.315.
[5] SHEIKH Y A, KHAN E A, KANADE T. Mode-seeking by medoidshifts[C]// Proceedings of the 2007 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2007: 1-8. DOI: 10.1109/iccv.2007.4408978.
[6] ROTHER C, KOLMOGOROV V, BLAKE A. "GrabCut": interactive foreground extraction using iterated graph cuts[J]. ACM Transactions on Graphics, 2004, 23(3): 309-314. DOI: 10.1145/1015706.1015720.
[7] LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 3431-3440. DOI: 10.1109/cvpr.2015.7298965.
[8] KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[J]. Communications of the ACM, 2017, 60(6): 84-90. DOI: 10.1145/3065386.
[9] SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[EB/OL]. [2015-04-10].
[10] SZEGEDY C, LIU W, JIA Y, et al. Going deeper with convolutions[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 1-9. DOI: 10.1109/cvpr.2015.7298594.
[11] BADRINARAYANAN V, KENDALL A, CIPOLLA R. SegNet: a deep convolutional encoder-decoder architecture for image segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(12): 2481-2495. DOI: 10.1109/tpami.2016.2644615.
[12] ZHAO H, SHI J, QI X, et al. Pyramid scene parsing network[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 2881-2890. DOI: 10.1109/cvpr.2017.660.
[13] HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 770-778. DOI: 10.1109/cvpr.2016.90.
[14] COUPRIE C, FARABET C, NAJMAN L, et al. Indoor semantic segmentation using depth information[EB/OL]. [2013-05-14].
[15] ZHANG Z. Microsoft Kinect sensor and its effect[J]. IEEE Multimedia, 2012, 19(2): 4-10. DOI: 10.1109/mmul.2012.24.
[16] HE Y, CHIU W C, KEUPER M, et al. STD2P: RGB-D semantic segmentation using spatio-temporal data-driven pooling[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 4837-4846. DOI: 10.1109/cvpr.2017.757.
[17] EIGEN D, FERGUS R. Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture[C]// Proceedings of the 2015 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2015: 2650-2658. DOI: 10.1109/iccv.2015.304.
[18] JIANG J, ZHENG L, LUO F, et al. RedNet: residual encoder-decoder network for indoor RGB-D semantic segmentation[EB/OL]. [2018-08-06].
[19] GUPTA S, GIRSHICK R, ARBELÁEZ P, et al. Learning rich features from RGB-D images for object detection and segmentation[C]// Proceedings of the 2014 European Conference on Computer Vision. Cham: Springer, 2014: 345-360. DOI: 10.1007/978-3-319-10584-0_23.
[20] HAZIRBAS C, MA L, DOMOKOS C, et al. FuseNet: incorporating depth into semantic segmentation via fusion-based CNN architecture[C]// Proceedings of the 2016 Asian Conference on Computer Vision. Cham: Springer, 2016: 213-228. DOI: 10.1007/978-3-319-54181-5_14.
[21] HU X, YANG K, FEI L, et al. ACNet: attention based network to exploit complementary features for RGB-D semantic segmentation[C]// Proceedings of the 2019 IEEE International Conference on Image Processing. Piscataway: IEEE, 2019: 1440-1444. DOI: 10.1109/icip.2019.8803025.
[22] SONG S, LICHTENBERG S P, XIAO J. SUN RGB-D: a RGB-D scene understanding benchmark suite[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 567-576. DOI: 10.1109/cvpr.2015.7298655.
[23] SILBERMAN N, HOIEM D, KOHLI P, et al. Indoor segmentation and support inference from RGBD images[C]// Proceedings of the 2012 European Conference on Computer Vision. Cham: Springer, 2012: 746-760. DOI: 10.1007/978-3-642-33715-4_54.
[24] PASZKE A, CHAURASIA A, KIM S, et al. ENet: a deep neural network architecture for real-time semantic segmentation[EB/OL]. [2016-06-07].
[25] WANG Y, ZHOU Q, LIU J, et al. LEDNet: a lightweight encoder-decoder network for real-time semantic segmentation[C]// Proceedings of the 2019 IEEE International Conference on Image Processing. Piscataway: IEEE, 2019: 1860-1864. DOI: 10.1109/icip.2019.8803154.
[26] VISIN F, CICCONE M, ROMERO A, et al. ReSeg: a recurrent neural network-based model for semantic segmentation[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE, 2016: 41-48. DOI: 10.1109/cvprw.2016.60.
[27] CHO K, VAN MERRIËNBOER B, GULCEHRE C, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation[EB/OL]. [2014-11-03]. DOI: 10.3115/v1/d14-1179.
[28] BYEON W, BREUEL T M, RAUE F, et al. Scene labeling with LSTM recurrent neural networks[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 3547-3555. DOI: 10.1109/cvpr.2015.7298977.
[29] LIPTON Z C, BERKOWITZ J, ELKAN C. A critical review of recurrent neural networks for sequence learning[EB/OL]. [2015-10-17].
[30] MNIH V, HEESS N, GRAVES A, et al. Recurrent models of visual attention[EB/OL]. [2014-06-24].
[31] FU J, LIU J, TIAN H, et al. Dual attention network for scene segmentation[C]// Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 3146-3154. DOI: 10.1109/cvpr.2019.00326.
[32] HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]// Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7132-7141. DOI: 10.1109/cvpr.2018.00745.
[33] HUANG Z, WANG X, HUANG L, et al. CCNet: criss-cross attention for semantic segmentation[C]// Proceedings of the 2019 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2019: 603-612. DOI: 10.1109/iccv.2019.00069.
[34] WANG F, JIANG M, QIAN C, et al. Residual attention network for image classification[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 3156-3164. DOI: 10.1109/cvpr.2017.683.
[35] ZHANG K, ZHANG Z, LI Z, et al. Joint face detection and alignment using multitask cascaded convolutional networks[J]. IEEE Signal Processing Letters, 2016, 23(10): 1499-1503. DOI: 10.1109/lsp.2016.2603342.
[36] FU C, FAN Q, MALLINAR N, et al. Big-little net: an efficient multi-scale feature representation for visual and speech recognition[EB/OL]. [2019-07-31].
[37] CHEN L C, PAPANDREOU G, KOKKINOS I, et al. Semantic image segmentation with deep convolutional nets and fully connected CRFs[EB/OL]. [2016-06-07].
[38] CHEN L C, PAPANDREOU G, KOKKINOS I, et al. DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 40(4): 834-848. DOI: 10.1109/tpami.2017.2699184.
[39] LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 2117-2125. DOI: 10.1109/cvpr.2017.106.
[40] PASZKE A, GROSS S, CHINTALA S, et al. Automatic differentiation in PyTorch[EB/OL]. [2017-10-28].
[41] JIANG J, ZHANG Z, HUANG Y, et al. Incorporating depth into both CNN and CRF for indoor semantic segmentation[C]// Proceedings of the 2017 8th IEEE International Conference on Software Engineering and Service Science. Piscataway: IEEE, 2017: 525-530. DOI: 10.1109/icsess.2017.8342970.
[42] LEE S, PARK S J, HONG K S. RDFNet: RGB-D multi-level residual feature fusion for indoor semantic segmentation[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 4990-4999. DOI: 10.1109/iccv.2017.533.
[43] QI X, LIAO R, JIA J, et al. 3D graph neural networks for RGBD semantic segmentation[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 5199-5208. DOI: 10.1109/iccv.2017.556.
[44] LIN D, CHEN G, COHEN-OR D, et al. Cascaded feature network for semantic segmentation of RGB-D images[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 1311-1319. DOI: 10.1109/iccv.2017.147.
[45] WANG W, NEUMANN U. Depth-aware CNN for RGB-D segmentation[C]// Proceedings of the 2018 European Conference on Computer Vision. Cham: Springer, 2018: 135-150. DOI: 10.1007/978-3-030-01252-6_9.
[46] YANG S J, QIU Z A, GAO X N, et al. RGBD semantic segmentation based on depth-sensitive spatial pyramid pooling[J]. Electronics Optics and Control, 2020, 27(12): 84-89. DOI: 10.3969/j.issn.1671-637X.2020.12.018.
[47] CHEN L Z, LIN Z, WANG Z, et al. Spatial information guided convolution for real-time RGBD semantic segmentation[J]. IEEE Transactions on Image Processing, 2021, 30: 2313-2324. DOI: 10.1109/tip.2021.3049332.
[48] CHENG Y, CAI R, LI Z, et al. Locality-sensitive deconvolution networks with gated fusion for RGB-D indoor semantic segmentation[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 3029-3037. DOI: 10.1109/cvpr.2017.161.
[49] GAO X, CAI M, LI J. Improved RGBD semantic segmentation using multi-scale features[C]// Proceedings of the 2018 Chinese Control and Decision Conference. Piscataway: IEEE, 2018: 3531-3536. DOI: 10.1109/ccdc.2018.8407734.
[50] LIN D, ZHANG R, JI Y, et al. SCN: switchable context network for semantic segmentation of RGB-D images[J]. IEEE Transactions on Cybernetics, 2018, 50(3): 1120-1131. DOI: 10.1109/TCYB.2018.2885062.
[51] XING Y, WANG J, CHEN X, et al. Coupling two-stream RGB-D semantic segmentation network by idempotent mappings[C]// Proceedings of the 2019 IEEE International Conference on Image Processing. Piscataway: IEEE, 2019: 1850-1854. DOI: 10.1109/icip.2019.8803146.
[52] ZHOU W, YUAN J, LEI J, et al. TSNet: three-stream self-attention network for RGB-D indoor semantic segmentation[EB/OL]. [2020-06-10]. DOI: 10.1109/mis.2020.2999462.