Local and global context attentive fusion network for traffic scene parsing

Journal of Computer Applications, 2023, Vol. 43, Issue (3): 713-722. DOI: 10.11772/j.issn.1001-9081.2022020245

Special topic: Artificial intelligence

Zeyu WANG¹, Shuhui BU², Wei HUANG¹, Yuanpan ZHENG¹, Qinggang WU¹, Xu ZHANG¹

1. College of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou, Henan 450002, China
2. School of Aeronautics, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China

Received: 2022-03-02  Revised: 2022-06-09  Accepted: 2022-06-14  Online: 2022-08-16  Published: 2023-03-10
Corresponding author: Zeyu WANG
About the authors:
WANG Zeyu, born in 1989, native of Zhengzhou, Henan, Ph.D., lecturer. His research interests include deep learning and computer vision.
BU Shuhui, born in 1978, native of Luoyang, Henan, Ph.D., professor. His research interests include deep learning and computer vision.
HUANG Wei, born in 1982, native of Zhengzhou, Henan, Ph.D., associate professor. His research interests include deep learning and computer vision.
ZHENG Yuanpan, born in 1983, native of Zhengzhou, Henan, Ph.D., associate professor. His research interests include deep learning and computer vision.
WU Qinggang, born in 1984, native of Puyang, Henan, Ph.D., associate professor. His research interests include deep learning and computer vision.
ZHANG Xu, born in 1979, native of Nanyang, Henan, M.S., lecturer. Her research interests include deep learning and computer vision.
Supported by:
Science and Technology Project of Henan Province (222102210021); Plan Support for Key Scientific Research Project of Higher Education in Henan Province (21A520049)

Abstract

To address the adaptive aggregation of local and global contextual information in traffic scene parsing, a Local and Global Context Attentive Fusion Network (LGCAFN) with a three-module architecture was proposed. The front-end feature extraction module consisted of a 101-layer Residual Network (ResNet-101) improved with Cascaded Atrous Spatial Pyramid Pooling (CASPP) units, and was able to extract multi-scale local features of objects more effectively. The mid-end structural learning module was composed of eight Long Short-Term Memory (LSTM) branches, and was able to infer more accurately the spatial structural features of the scene regions adjacent to an object in eight different directions. In the back-end feature fusion module, a three-stage fusion method based on the attention mechanism was adopted to adaptively aggregate useful contextual information and suppress noisy contextual information, and the generated multi-modal fusion features were able to represent the semantic information of objects more comprehensively and accurately. Experimental results on the Cityscapes standard and extended datasets demonstrate that, compared with existing state-of-the-art methods such as Inverse Transformation Network (ITN) and Object Contextual Representation Network (OCRN), LGCAFN achieves the best mean Intersection over Union (mIoU), reaching 84.0% and 86.3% respectively, showing that LGCAFN can parse traffic scenes accurately and is helpful for realizing autonomous driving of vehicles.

Keywords: traffic scene parsing, adaptive aggregation, Cascaded Atrous Spatial Pyramid Pooling (CASPP), Long Short-Term Memory (LSTM), attentive fusion
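
The mIoU metric quoted in the abstract averages the per-class intersection over union. The following is a minimal sketch of that computation, assuming predictions and labels are integer class maps; the helper name `mean_iou` and the toy arrays are illustrative and do not come from the paper.

```python
import numpy as np


def mean_iou(pred, label, num_classes):
    """Mean Intersection over Union from integer class maps (illustrative helper)."""
    pred = np.asarray(pred).ravel()
    label = np.asarray(label).ravel()
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(label * num_classes + pred,
                       minlength=num_classes ** 2).reshape(num_classes, num_classes)
    intersection = np.diag(conf).astype(float)
    union = conf.sum(axis=0) + conf.sum(axis=1) - np.diag(conf)
    present = union > 0  # skip classes absent from both prediction and label
    return float((intersection[present] / union[present]).mean())


# Toy 2x3 label map and prediction with three classes (made-up data).
label = np.array([[0, 0, 1], [1, 2, 2]])
pred = np.array([[0, 1, 1], [1, 2, 0]])
print(round(mean_iou(pred, label, 3), 3))  # 0.5
```

In the toy example the per-class IoU values are 1/3, 2/3 and 1/2, so their mean is 0.5.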

CLC number: TP391.4

Zeyu WANG, Shuhui BU, Wei HUANG, Yuanpan ZHENG, Qinggang WU, Xu ZHANG. Local and global context attentive fusion network for traffic scene parsing[J]. Journal of Computer Applications, 2023, 43(3): 713-722.


Method	Backbone	Extended dataset	Road	Sidewalk	Building	Wall	Fence	Pole	Traffic light	Traffic sign	Vegetation	Terrain
CPN	ResNet-101	—	—	—	—	—	—	—	—	—	—	—
SPBGRN	ResNet-101	—	98.7	86.9	93.6	57.6	62.8	70.3	78.7	81.7	93.8	72.4
SCARN	ResNet-101	—	—	—	—	—	—	—	—	—	—	—
SBEPN	ResNet-101	—	—	—	—	—	—	—	—	—	—	—
STLN	ResNet-101	—	—	—	—	—	—	—	—	—	—	—
GPN	ResNet-101	—	98.8	87.8	93.8	61.8	63.3	70.4	78.9	81.7	94.0	72.4
CEN	ResNet-101	—	98.8	89.1	94.6	62.7	63.7	66.4	75.7	79.7	94.7	73.6
CAAN	ResNet-101	—	—	—	—	—	—	—	—	—	—	—
RCAN	HRNet-W48	—	—	—	—	—	—	—	—	—	—	—
OCRN	HRNet-W48	—	98.8	88.2	94.2	67.6	65.3	72.1	79.0	82.3	94.1	73.8
LGCAFN	ResNet-101	—	98.9	88.9	94.0	66.8	66.5	73.6	79.6	82.3	94.2	73.8
SWRN	SWideRNet-(1, 1, 4.5)	✓	98.8	88.4	94.6	68.2	68.6	76.0	81.2	84.7	94.3	74.1
HMAN	HRNet-W48	✓	98.9	89.3	94.9	71.8	68.3	75.8	82.1	85.2	94.4	74.9
ITN	HRNet-W48	✓	98.8	89.6	94.8	71.7	69.1	75.7	82.2	85.4	94.2	74.9
LGCAFN	ResNet-101	✓	99.0	89.3	95.0	73.4	72.3	76.3	82.5	86.3	94.7	75.6
Method	Backbone	Extended dataset	Sky	Person	Rider	Car	Truck	Bus	Train	Motorcycle	Bicycle	Mean (mIoU)
CPN	ResNet-101	—	—	—	—	—	—	—	—	—	—	81.3
SPBGRN	ResNet-101	—	95.6	88.1	74.5	96.2	73.6	88.8	86.3	72.1	79.2	81.6
SCARN	ResNet-101	—	—	—	—	—	—	—	—	—	—	82.1
SBEPN	ResNet-101	—	—	—	—	—	—	—	—	—	—	82.2
STLN	ResNet-101	—	—	—	—	—	—	—	—	—	—	82.3
GPN	ResNet-101	—	95.9	88.2	74.8	96.4	80.4	91.1	85.4	72.0	78.6	82.5
CEN	ResNet-101	—	96.4	87.3	75.4	94.2	79.4	91.9	86.8	73.3	79.7	82.5
CAAN	ResNet-101	—	—	—	—	—	—	—	—	—	—	82.6
RCAN	HRNet-W48	—	—	—	—	—	—	—	—	—	—	82.7
OCRN	HRNet-W48	—	95.9	88.1	74.9	96.3	76.8	92.2	90.8	72.8	78.8	83.3
LGCAFN	ResNet-101	—	95.6	88.9	77.3	95.2	81.0	93.3	89.3	75.6	80.6	84.0
SWRN	SWideRNet-(1, 1, 4.5)	✓	96.2	89.7	79.7	96.7	82.0	94.1	92.1	77.1	79.2	85.1
HMAN	HRNet-W48	✓	96.3	90.1	79.7	96.9	82.5	94.6	87.8	77.1	81.7	85.4
ITN	HRNet-W48	✓	96.2	90.2	79.8	96.9	84.3	95.7	90.5	77.1	81.6	85.7
LGCAFN	ResNet-101	✓	96.0	90.5	80.4	97.0	84.2	94.6	91.1	78.9	82.4	86.3
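
As a consistency check on the table above, the last column can be recomputed as the plain average of the 19 per-class values. The short sketch below does this for the two LGCAFN rows; the variable and function names are illustrative, and the numbers are copied from the table.

```python
# Per-class IoU (%) of the two LGCAFN rows, copied from the table above
# (road ... terrain, then sky ... bicycle).
lgcafn_standard = [98.9, 88.9, 94.0, 66.8, 66.5, 73.6, 79.6, 82.3, 94.2, 73.8,
                   95.6, 88.9, 77.3, 95.2, 81.0, 93.3, 89.3, 75.6, 80.6]
lgcafn_extended = [99.0, 89.3, 95.0, 73.4, 72.3, 76.3, 82.5, 86.3, 94.7, 75.6,
                   96.0, 90.5, 80.4, 97.0, 84.2, 94.6, 91.1, 78.9, 82.4]


def mean_of(values):
    """Arithmetic mean rounded to one decimal, as in the table."""
    return round(sum(values) / len(values), 1)


print(mean_of(lgcafn_standard), mean_of(lgcafn_extended))  # 84.0 86.3
```

Both results agree with the mIoU values of 84.0% and 86.3% reported in the abstract.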

Method	Backbone	Parameters/10⁶	Floating-point operations/GFLOPs	mIoU/%
SWRN	SWideRNet-(1, 1, 4.5)	168.77	680.7	85.1
OCRN	HRNet-W48	67.25	410.6	83.3
CEN	ResNet-101	92.80	286.1	82.5
ITN	HRNet-W48	69.00	253.3	85.7
LGCAFN	ResNet-101	65.75	228.9	86.3
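
To make the efficiency comparison in the table above easier to read, the sketch below expresses how much smaller LGCAFN is than each other method, in percent, together with the mIoU difference. The numbers are copied from the table; the helper `relative_saving` is an illustrative name, not something defined in the paper.

```python
# (parameters in millions, GFLOPs, mIoU %) copied from the table above.
models = {
    "SWRN": (168.77, 680.7, 85.1),
    "OCRN": (67.25, 410.6, 83.3),
    "CEN": (92.80, 286.1, 82.5),
    "ITN": (69.00, 253.3, 85.7),
    "LGCAFN": (65.75, 228.9, 86.3),
}


def relative_saving(baseline, ours):
    """Percentage by which `ours` is smaller than `baseline`."""
    return round(100.0 * (baseline - ours) / baseline, 1)


ref_params, ref_flops, ref_miou = models["LGCAFN"]
for name, (params, flops, miou) in models.items():
    if name == "LGCAFN":
        continue
    print(name,
          relative_saving(params, ref_params),  # fewer parameters, %
          relative_saving(flops, ref_flops),    # fewer GFLOPs, %
          round(ref_miou - miou, 1))            # mIoU gain, percentage points
```

Against ITN, for instance, this gives about 4.7% fewer parameters and 9.6% fewer GFLOPs with a 0.6 point higher mIoU.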

Model	mIoU/%
Baseline	77.6
Baseline+CASPP	80.4
Baseline+CASPP+LSTM	82.8
Baseline+CASPP+LSTM+Attention	84.0
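
The contribution of each module can be read off the ablation table above by differencing consecutive rows. A small sketch of that arithmetic, with values copied from the table and illustrative variable names:

```python
# (model variant, mIoU %) copied from the ablation table above.
ablation = [
    ("Baseline", 77.6),
    ("Baseline+CASPP", 80.4),
    ("Baseline+CASPP+LSTM", 82.8),
    ("Baseline+CASPP+LSTM+Attention", 84.0),
]

# Gain in percentage points contributed by each added module.
gains = [(name, round(miou - prev, 1))
         for (_, prev), (name, miou) in zip(ablation, ablation[1:])]
total_gain = round(ablation[-1][1] - ablation[0][1], 1)

for name, gain in gains:
    print(f"{name}: +{gain}")
print(f"total: +{total_gain}")
```

The increments are 2.8, 2.4 and 1.2 points for CASPP, the LSTM branches and the attention-based fusion respectively, 6.4 points in total.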

Method	ResNet-101 (r₁, r₂, r₃, r₄, r₅)	mIoU/%
1	ResNet-101 (1, (1, 1, 1), (1, 1, 1, 1), (1, 1_6, 1_4, 1_4, 1_4, 1_4), (1, 1, 1))	77.6
	ResNet-101 (2, (2, 2, 2), (2, 2, 2, 2), (2, 2_6, 2_4, 2_4, 2_4, 2_4), (2, 2, 2))	78.3
	ResNet-101 (4, (4, 4, 4), (4, 4, 4, 4), (4, 4_6, 4_4, 4_4, 4_4, 4_4), (4, 4, 4))	78.6
	ResNet-101 (8, (8, 8, 8), (8, 8, 8, 8), (8, 8_6, 8_4, 8_4, 8_4, 8_4), (8, 8, 8))	78.9
	ResNet-101 (16, (16, 16, 16), (16, 16, 16, 16), (16, 16_6, 16_4, 16_4, 16_4, 16_4), (16, 16, 16))	77.8
	ResNet-101 (24, (24, 24, 24), (24, 24, 24, 24), (24, 24_6, 24_4, 24_4, 24_4, 24_4), (24, 24, 24))	76.9
2	ResNet-101 (2, (4, 4, 4), (8, 8, 8, 8), (8, 8_6, 8_4, 8_4, 8_4, 8_4), (16, 16, 16))	79.5
3	ResNet-101 (2, (2, 4, 8), (2, 4, 8, 16), (2, 4_6, 8_4, 8_4, 16_4, 24_4), (4, 8, 6))	80.4
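
The r₄ entries in the table above use a compact notation such as 8_4. One plausible reading, assumed here and not confirmed by this page, is "rate 8 repeated for 4 residual blocks"; under that reading each r₄ tuple expands to 23 rates, which matches the 23 blocks of the fourth stage of ResNet-101. The sketch below expands the notation and also gives the standard effective kernel size of a dilated convolution, k + (k - 1)(r - 1). Both helper names are illustrative.

```python
def expand_rates(spec):
    """Expand tokens such as '8_4' (assumed: rate 8 for 4 blocks) into a flat list."""
    rates = []
    for token in spec.split(","):
        rate, _, count = token.strip().partition("_")
        rates.extend([int(rate)] * (int(count) if count else 1))
    return rates


def effective_kernel(rate, kernel=3):
    """Effective kernel size of a dilated convolution with the given rate."""
    return kernel + (kernel - 1) * (rate - 1)


# r4 of method 3, copied from the table above.
r4 = expand_rates("2, 4_6, 8_4, 8_4, 16_4, 24_4")
print(len(r4))                              # 23
print(sorted(set(r4)))                      # [2, 4, 8, 16, 24]
print(effective_kernel(1), effective_kernel(24))  # 3 49
```

A 3×3 kernel at rate 24 therefore spans 49 positions along each axis, which illustrates why mixing small and large rates yields multi-scale local features.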

Method	mIoU/%
LSTM (↓, ↑, →, ←)	82.3
LSTM (↘, ↖, ↙, ↗)	81.6
LSTM (↓, ↑, →, ←, ↘, ↖, ↙, ↗)	82.8
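
The arrows in the table above denote the scan directions of the LSTM branches. The sketch below only illustrates how context can be propagated across a feature map along each of the eight directions. It deliberately replaces the LSTM cell with a toy linear recurrence, so it is not the authors' structural learning module; the offset table, the `decay` parameter and the function name are assumptions made for illustration.

```python
import numpy as np

# Step (dy, dx) taken by the scan for each arrow (assumed convention).
DIRECTIONS = {
    "↓": (1, 0), "↑": (-1, 0), "→": (0, 1), "←": (0, -1),
    "↘": (1, 1), "↖": (-1, -1), "↙": (1, -1), "↗": (-1, 1),
}


def directional_scan(feat, step, decay=0.5):
    """Accumulate context along one direction with a toy linear recurrence.

    Stands in for an LSTM cell: h[y, x] = feat[y, x] + decay * h[y - dy, x - dx].
    """
    dy, dx = step
    height, width = feat.shape
    # Visit rows and columns so that each cell's predecessor is computed first.
    rows = range(height) if dy >= 0 else range(height - 1, -1, -1)
    cols = range(width) if dx >= 0 else range(width - 1, -1, -1)
    hidden = np.zeros_like(feat, dtype=float)
    for y in rows:
        for x in cols:
            py, px = y - dy, x - dx
            inside = 0 <= py < height and 0 <= px < width
            prev = hidden[py, px] if inside else 0.0
            hidden[y, x] = feat[y, x] + decay * prev
    return hidden


feat = np.ones((3, 3))
context = np.stack([directional_scan(feat, step) for step in DIRECTIONS.values()])
print(context.shape)  # (8, 3, 3)
```

Each of the eight output maps summarizes what lies "upstream" of a pixel in one direction, which is the kind of directional context the eight-branch configuration combines.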

Method	mIoU/%
Concatenation	83.1
Element-wise addition	83.3
Attention mechanism	84.0
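
The three fusion strategies compared in the table above differ in how two feature sources are combined. The sketch below contrasts them on feature maps of shape (C, H, W). It is a simplified single-stage illustration, not the three-stage fusion described in the abstract, and the scoring vectors `w_local` and `w_context` are hypothetical stand-ins for learned parameters.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def fuse_concat(local, context):
    # Stack both sources along the channel axis: two (C, H, W) maps -> (2C, H, W).
    return np.concatenate([local, context], axis=0)


def fuse_add(local, context):
    # Fixed, equal weighting of both sources.
    return local + context


def fuse_attention(local, context, w_local, w_context):
    """Per-pixel weighted sum; w_* are hypothetical scoring vectors of shape (C,)."""
    scores = np.stack([
        np.einsum("c,chw->hw", w_local, local),
        np.einsum("c,chw->hw", w_context, context),
    ])
    weights = softmax(scores, axis=0)  # the two weights sum to 1 at every pixel
    return weights[0] * local + weights[1] * context


rng = np.random.default_rng(0)
local = rng.normal(size=(4, 2, 2))
context = rng.normal(size=(4, 2, 2))
fused = fuse_attention(local, context, rng.normal(size=4), rng.normal(size=4))
print(fuse_concat(local, context).shape, fuse_add(local, context).shape, fused.shape)
```

Unlike concatenation and addition, the attention variant can down-weight one source at pixels where its score is low, which is the intuition behind suppressing noisy context.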
