Journal of Computer Applications, 2023, Vol. 43, Issue 3: 713-722. DOI: 10.11772/j.issn.1001-9081.2022020245
Special topic: Artificial Intelligence
Zeyu WANG1, Shuhui BU2, Wei HUANG1, Yuanpan ZHENG1, Qinggang WU1, Xu ZHANG1
Received: 2022-03-02
Revised: 2022-06-09
Accepted: 2022-06-14
Online: 2022-08-16
Published: 2023-03-10
Contact: Zeyu WANG
About author: WANG Zeyu, born in 1989 in Zhengzhou, Henan, Ph.D., lecturer. His research interests include deep learning and computer vision.
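Abstract:
To address the adaptive aggregation of local and global context information in traffic scene parsing, a Local and Global Context Attentive Fusion Network (LGCAFN) with a three-module architecture is proposed. The front-end feature extraction module consists of a ResNet-101 improved with Cascaded Atrous Spatial Pyramid Pooling (CASPP) units, which extracts multi-scale local features of objects more effectively. The middle structured learning module consists of eight Long Short-Term Memory (LSTM) network branches, which infer more accurately the spatial structured features of the scene regions adjacent to an object in eight different directions. The back-end feature fusion module adopts a three-stage attention-based fusion scheme that adaptively aggregates useful context information and suppresses noisy context information, and the resulting multi-modal fused features represent the semantic information of objects more comprehensively and accurately. Experimental results on the Cityscapes standard and extended datasets show that, compared with methods such as the Inverse Transformation Network (ITN) and the Object-Contextual Representation Network (OCRN), LGCAFN achieves the best mean Intersection over Union (mIoU), reaching 84.0% and 86.3% respectively. This indicates that LGCAFN parses traffic scenes accurately and can support autonomous driving.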
Cite this article: Zeyu WANG, Shuhui BU, Wei HUANG, Yuanpan ZHENG, Qinggang WU, Xu ZHANG. Local and global context attentive fusion network for traffic scene parsing[J]. Journal of Computer Applications, 2023, 43(3): 713-722.
Method | Backbone | Extended dataset | Road | Sidewalk | Building | Wall | Fence | Pole | Traffic light | Traffic sign | Vegetation | Terrain |
---|---|---|---|---|---|---|---|---|---|---|---|---|
CPN | ResNet-101 | - | - | - | - | - | - | - | - | - | - | - |
SPBGRN | ResNet-101 | - | 98.7 | 86.9 | 93.6 | 57.6 | 62.8 | 70.3 | 78.7 | 81.7 | 93.8 | 72.4 |
SCARN | ResNet-101 | - | - | - | - | - | - | - | - | - | - | - |
SBEPN | ResNet-101 | - | - | - | - | - | - | - | - | - | - | - |
STLN | ResNet-101 | - | - | - | - | - | - | - | - | - | - | - |
GPN | ResNet-101 | - | 98.8 | 87.8 | 93.8 | 61.8 | 63.3 | 70.4 | 78.9 | 81.7 | 94.0 | 72.4 |
CEN | ResNet-101 | - | 98.8 | 89.1 | 94.6 | 62.7 | 63.7 | 66.4 | 75.7 | 79.7 | 94.7 | 73.6 |
CAAN | ResNet-101 | - | - | - | - | - | - | - | - | - | - | - |
RCAN | HRNet-W48 | - | - | - | - | - | - | - | - | - | - | - |
OCRN | HRNet-W48 | - | 98.8 | 88.2 | 94.2 | 67.6 | 65.3 | 72.1 | 79.0 | 82.3 | 94.1 | 73.8 |
LGCAFN | ResNet-101 | - | 98.9 | 88.9 | 94.0 | 66.8 | 66.5 | 73.6 | 79.6 | 82.3 | 94.2 | 73.8 |
SWRN | SWideRNet-(1,1,4.5) | ✓ | 98.8 | 88.4 | 94.6 | 68.2 | 68.6 | 76.0 | 81.2 | 84.7 | 94.3 | 74.1 |
HMAN | HRNet-W48 | ✓ | 98.9 | 89.3 | 94.9 | 71.8 | 68.3 | 75.8 | 82.1 | 85.2 | 94.4 | 74.9 |
ITN | HRNet-W48 | ✓ | 98.8 | 89.6 | 94.8 | 71.7 | 69.1 | 75.7 | 82.2 | 85.4 | 94.2 | 74.9 |
LGCAFN | ResNet-101 | ✓ | 99.0 | 89.3 | 95.0 | 73.4 | 72.3 | 76.3 | 82.5 | 86.3 | 94.7 | 75.6 |

Method | Backbone | Extended dataset | Sky | Person | Rider | Car | Truck | Bus | Train | Motorcycle | Bicycle | Mean |
---|---|---|---|---|---|---|---|---|---|---|---|---|
CPN | ResNet-101 | - | - | - | - | - | - | - | - | - | - | 81.3 |
SPBGRN | ResNet-101 | - | 95.6 | 88.1 | 74.5 | 96.2 | 73.6 | 88.8 | 86.3 | 72.1 | 79.2 | 81.6 |
SCARN | ResNet-101 | - | - | - | - | - | - | - | - | - | - | 82.1 |
SBEPN | ResNet-101 | - | - | - | - | - | - | - | - | - | - | 82.2 |
STLN | ResNet-101 | - | - | - | - | - | - | - | - | - | - | 82.3 |
GPN | ResNet-101 | - | 95.9 | 88.2 | 74.8 | 96.4 | 80.4 | 91.1 | 85.4 | 72.0 | 78.6 | 82.5 |
CEN | ResNet-101 | - | 96.4 | 87.3 | 75.4 | 94.2 | 79.4 | 91.9 | 86.8 | 73.3 | 79.7 | 82.5 |
CAAN | ResNet-101 | - | - | - | - | - | - | - | - | - | - | 82.6 |
RCAN | HRNet-W48 | - | - | - | - | - | - | - | - | - | - | 82.7 |
OCRN | HRNet-W48 | - | 95.9 | 88.1 | 74.9 | 96.3 | 76.8 | 92.2 | 90.8 | 72.8 | 78.8 | 83.3 |
LGCAFN | ResNet-101 | - | 95.6 | 88.9 | 77.3 | 95.2 | 81.0 | 93.3 | 89.3 | 75.6 | 80.6 | 84.0 |
SWRN | SWideRNet-(1,1,4.5) | ✓ | 96.2 | 89.7 | 79.7 | 96.7 | 82.0 | 94.1 | 92.1 | 77.1 | 79.2 | 85.1 |
HMAN | HRNet-W48 | ✓ | 96.3 | 90.1 | 79.7 | 96.9 | 82.5 | 94.6 | 87.8 | 77.1 | 81.7 | 85.4 |
ITN | HRNet-W48 | ✓ | 96.2 | 90.2 | 79.8 | 96.9 | 84.3 | 95.7 | 90.5 | 77.1 | 81.6 | 85.7 |
LGCAFN | ResNet-101 | ✓ | 96.0 | 90.5 | 80.4 | 97.0 | 84.2 | 94.6 | 91.1 | 78.9 | 82.4 | 86.3 |
Tab. 1 mIoU results (%) of LGCAFN and existing state-of-the-art methods on Cityscapes dataset
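The mean (last) column of Tab. 1 can be cross-checked against the 19 per-class columns. The sketch below averages the two LGCAFN rows; the function name `mean_iou` is a hypothetical helper, and because the published per-class values are already rounded to one decimal, the recomputed means should only be expected to agree with the reported 84.0 and 86.3 up to rounding.

```python
# Per-class IoU (%) of the two LGCAFN rows in Tab. 1, in the table's column
# order (road ... terrain, then sky ... bicycle).
lgcafn_standard = [98.9, 88.9, 94.0, 66.8, 66.5, 73.6, 79.6, 82.3, 94.2, 73.8,
                   95.6, 88.9, 77.3, 95.2, 81.0, 93.3, 89.3, 75.6, 80.6]
lgcafn_extended = [99.0, 89.3, 95.0, 73.4, 72.3, 76.3, 82.5, 86.3, 94.7, 75.6,
                   96.0, 90.5, 80.4, 97.0, 84.2, 94.6, 91.1, 78.9, 82.4]


def mean_iou(per_class_iou):
    """Unweighted mean of per-class IoU values (hypothetical helper)."""
    return sum(per_class_iou) / len(per_class_iou)


if __name__ == "__main__":
    print(f"standard: {mean_iou(lgcafn_standard):.1f}")
    print(f"extended: {mean_iou(lgcafn_extended):.1f}")
```

The same helper applies to any other row that reports all 19 classes, for example OCRN or ITN.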
Method | Backbone | Parameters/10^6 | Floating-point operations/GFLOPs | mIoU/% |
---|---|---|---|---|
SWRN | SWideRNet-(1,1,4.5) | 168.77 | 680.7 | 85.1 |
OCRN | HRNet-W48 | 67.25 | 410.6 | 83.3 |
CEN | ResNet-101 | 92.80 | 286.1 | 82.5 |
ITN | HRNet-W48 | 69.00 | 253.3 | 85.7 |
LGCAFN | ResNet-101 | 65.75 | 228.9 | 86.3 |
Tab. 2 Model complexity comparison on Cityscapes dataset
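To read Tab. 2 as relative savings rather than absolute numbers, the values can be compared against the LGCAFN row. This is a small sketch over the table's figures only; `relative_saving` and the dictionary layout are hypothetical names chosen for illustration.

```python
# (parameters in 10^6, GFLOPs, mIoU in %) copied from Tab. 2.
models = {
    "SWRN": (168.77, 680.7, 85.1),
    "OCRN": (67.25, 410.6, 83.3),
    "CEN": (92.80, 286.1, 82.5),
    "ITN": (69.00, 253.3, 85.7),
    "LGCAFN": (65.75, 228.9, 86.3),
}


def relative_saving(reference, candidate):
    """Percentage by which `candidate` is smaller than `reference`."""
    return 100.0 * (reference - candidate) / reference


if __name__ == "__main__":
    params, gflops, miou = models["LGCAFN"]
    for name, (p, g, m) in models.items():
        if name == "LGCAFN":
            continue
        print(f"vs {name}: {relative_saving(p, params):.1f}% fewer parameters, "
              f"{relative_saving(g, gflops):.1f}% fewer GFLOPs, "
              f"{miou - m:+.1f} mIoU points")
```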
Model | mIoU/% |
---|---|
Baseline | 77.6 |
Baseline+CASPP | 80.4 |
Baseline+CASPP+LSTM | 82.8 |
Baseline+CASPP+LSTM+Attention | 84.0 |
Tab. 3 Ablation study on Cityscapes dataset
Method | ResNet-101(r1, r2, r3, r4, r5) | mIoU/% |
---|---|---|
1 | ResNet-101(1,(1,1,1),(1,1,1,1),(1,1_6,1_4,1_4,1_4,1_4),(1,1,1)) | 77.6 |
1 | ResNet-101(2,(2,2,2),(2,2,2,2),(2,2_6,2_4,2_4,2_4,2_4),(2,2,2)) | 78.3 |
1 | ResNet-101(4,(4,4,4),(4,4,4,4),(4,4_6,4_4,4_4,4_4,4_4),(4,4,4)) | 78.6 |
1 | ResNet-101(8,(8,8,8),(8,8,8,8),(8,8_6,8_4,8_4,8_4,8_4),(8,8,8)) | 78.9 |
1 | ResNet-101(16,(16,16,16),(16,16,16,16),(16,16_6,16_4,16_4,16_4,16_4),(16,16,16)) | 77.8 |
1 | ResNet-101(24,(24,24,24),(24,24,24,24),(24,24_6,24_4,24_4,24_4,24_4),(24,24,24)) | 76.9 |
2 | ResNet-101(2,(4,4,4),(8,8,8,8),(8,8_6,8_4,8_4,8_4,8_4),(16,16,16)) | 79.5 |
3 | ResNet-101(2,(2,4,8),(2,4,8,16),(2,4_6,8_4,8_4,16_4,24_4),(4,8, 6)) | 80.4 |
Tab. 4 Study of sparse sampling rate settings of the feature extraction module on Cityscapes dataset
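The rate tuples in Tab. 4 are compact, so a small sketch may help in reading them. It assumes that an entry written `a_b` means "rate `a` repeated for `b` consecutive residual blocks"; that reading is an interpretation, not something stated on this page, but it yields 23 entries for r4, which matches the 23 blocks in the fourth stage of ResNet-101. The helper names are hypothetical. The second function uses the standard relation for a dilated (atrous) convolution: a kernel of size k with rate r spans k + (k - 1)(r - 1) input positions.

```python
def expand_rates(spec):
    """Expand entries such as '4_6' (assumed: rate 4 for 6 blocks) into a flat list."""
    rates = []
    for item in spec:
        text = str(item)
        if "_" in text:
            rate, count = text.split("_")
            rates.extend([int(rate)] * int(count))
        else:
            rates.append(int(text))
    return rates


def effective_kernel(kernel_size, rate):
    """Span of a dilated convolution kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


# r4 of method 3 in Tab. 4, written with the table's own notation.
r4_method3 = expand_rates([2, "4_6", "8_4", "8_4", "16_4", "24_4"])

if __name__ == "__main__":
    print(len(r4_method3), "blocks:", r4_method3)
    for rate in sorted(set(r4_method3)):
        print(f"rate {rate:2d} -> 3x3 kernel spans {effective_kernel(3, rate)} pixels")
```

Under this reading, method 3 mixes small and large rates inside one stage, while the rows of method 1 use a single rate everywhere.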
Method | mIoU/% |
---|---|
LSTM(↓,↑,→,←) | 82.3 |
LSTM(↘,↖,↙,↗) | 81.6 |
LSTM(↓,↑,→,←,↘,↖,↙,↗) | 82.8 |
Tab. 5 Effect of different LSTM traversal methods on performance
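The eight arrows in Tab. 5 can be read as eight scan directions over a feature map. The sketch below only enumerates the order in which grid cells would be visited for each direction; it does not reproduce the paper's LSTM branches, and the direction names, the `scan_chains` helper and the mapping of arrows to (row, column) steps are assumptions made for illustration. Each returned chain is a sequence of cells that a recurrent unit could consume in order.

```python
# Assumed mapping of the arrows in Tab. 5 to (row step, column step).
DIRECTIONS = {
    "down": (1, 0), "up": (-1, 0), "right": (0, 1), "left": (0, -1),
    "down_right": (1, 1), "up_left": (-1, -1),
    "down_left": (1, -1), "up_right": (-1, 1),
}


def scan_chains(height, width, step):
    """Split a height x width grid into chains of cells visited along `step`."""
    dy, dx = step
    chains = []
    for y in range(height):
        for x in range(width):
            py, px = y - dy, x - dx
            if 0 <= py < height and 0 <= px < width:
                continue  # this cell has a predecessor, so no chain starts here
            chain = []
            cy, cx = y, x
            while 0 <= cy < height and 0 <= cx < width:
                chain.append((cy, cx))
                cy, cx = cy + dy, cx + dx
            chains.append(chain)
    return chains


if __name__ == "__main__":
    for name, step in DIRECTIONS.items():
        chains = scan_chains(3, 4, step)
        print(f"{name:10s}: {len(chains)} chains, first = {chains[0]}")
```

Axis-aligned directions give one chain per row or column, while diagonal directions give shorter chains near the corners, so the two groups expose a cell to different neighbours.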
Method | mIoU/% |
---|---|
Concatenation | 83.1 |
Element-wise addition | 83.3 |
Attention mechanism | 84.0 |
Tab. 6 Effect of different fusion methods on performance
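The three rows of Tab. 6 differ in how a local feature vector and a global context vector are combined. The sketch below contrasts the three ideas on plain Python lists. It is a generic illustration, not the paper's three-stage fusion module: the softmax weighting stands in for whatever attention the network learns, and the `scores` argument is a hypothetical input that a trained network would predict from the features themselves.

```python
import math


def concatenate(local, global_ctx):
    """Concatenation: keep both vectors side by side (dimension doubles)."""
    return list(local) + list(global_ctx)


def add(local, global_ctx):
    """Element-wise addition: both sources always contribute equally."""
    return [a + b for a, b in zip(local, global_ctx)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(features, scores):
    """Weighted sum of feature vectors, with weights given by softmax(scores)."""
    weights = softmax(scores)
    dim = len(features[0])
    return [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]


if __name__ == "__main__":
    local, global_ctx = [1.0, 2.0], [3.0, 4.0]
    print(concatenate(local, global_ctx))
    print(add(local, global_ctx))
    # A low score on the second source suppresses it, which is how an
    # attention weight can mask noisy context.
    print(attention_fuse([local, global_ctx], scores=[2.0, -2.0]))
```

Unlike the two fixed schemes, the weighted sum can change the balance between sources per input, which is the property the abstract attributes to the attention-based fusion.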