Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (7): 2155-2165. DOI: 10.11772/j.issn.1001-9081.2022060908
Special topic: Artificial Intelligence
Jing ZHOU1, Yiyu HU1, Chengyu HU2, Tianjiang WANG3
Received: 2022-06-23
Revised: 2022-09-16
Accepted: 2022-09-22
Online: 2022-10-18
Published: 2023-07-10
Contact: Jing ZHOU
About author: ZHOU Jing, born in 1981 in Xiangyang, Hubei, Ph. D., professor. Her research interests include three-dimensional object detection and deep learning.
Supported by:
Abstract: Aiming at the low detection precision of weakly perceived objects with missing shapes in distant or occluded scenes, a Weakly Perceived object detection method based on point cloud Completion and Multi-resolution Transformer (WP-CMT) was proposed. Firstly, considering that the down-sampling convolution operations in object detection networks cause the loss of some key information, the Part-Aware and Aggregation (Part-A2) method, which has a deconvolution up-sampling structure, was chosen as the base network to generate initial candidate boxes. Then, to enhance the shape and position features of the weakly perceived objects in the initial candidate boxes, a point cloud completion module was used to reconstruct dense point sets on the surfaces of the weakly perceived objects, and a novel multi-resolution Transformer feature encoding module was built to aggregate the completed shape features and the original spatial position information of the weakly perceived objects; the enhanced local features of the weakly perceived objects were captured by progressively encoding the contextual semantic correlations of the aggregated features on local coordinate point sets at different resolutions, and refined detection boxes were finally generated. Experimental results show that, for hard-level weakly perceived objects in the KITTI and Waymo datasets, the Average Precision (AP) and mean Average Precision (mAP) of WP-CMT are 2.51 and 1.59 percentage points higher than those of the baseline method Part-A2 respectively, which verifies the effectiveness of the proposed method for weakly perceived object detection. Meanwhile, ablation experimental results show that the point cloud completion and multi-resolution Transformer feature encoding modules in WP-CMT effectively improve the detection performance on weakly perceived objects for different types of Region Proposal Network (RPN) structures.
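The full method section is not part of this excerpt, but the multi-resolution Transformer feature encoding described in the abstract can be illustrated with a short sketch: per-proposal point features are encoded with self-attention at several resolutions, and the per-resolution context features are then fused. This is a minimal illustration only, not the authors' implementation; the module name, channel sizes, and the random subsampling used in place of a distance-aware sampler are assumptions.

```python
import torch
import torch.nn as nn


class MultiResolutionTransformerEncoder(nn.Module):
    """Hypothetical sketch of multi-resolution self-attention encoding over RoI point features."""

    def __init__(self, channels: int = 128, num_heads: int = 4, resolutions=(256, 128, 64)):
        super().__init__()
        self.resolutions = resolutions
        self.blocks = nn.ModuleList([
            nn.TransformerEncoderLayer(d_model=channels, nhead=num_heads,
                                       dim_feedforward=2 * channels, batch_first=True)
            for _ in resolutions
        ])
        self.fuse = nn.Linear(channels * len(resolutions), channels)

    def forward(self, point_feats: torch.Tensor) -> torch.Tensor:
        # point_feats: (B, N, C) aggregated features of completed + original RoI points.
        pooled = []
        for n_pts, block in zip(self.resolutions, self.blocks):
            # Subsample to the target resolution (placeholder for farthest point sampling).
            idx = torch.randperm(point_feats.shape[1])[:n_pts]
            encoded = block(point_feats[:, idx, :])        # context encoding at this resolution
            pooled.append(encoded.max(dim=1).values)       # (B, C) per-resolution summary
        return self.fuse(torch.cat(pooled, dim=-1))        # fused RoI feature, (B, C)


if __name__ == "__main__":
    feats = torch.randn(2, 512, 128)                         # 2 proposals, 512 points, 128-dim features
    print(MultiResolutionTransformerEncoder()(feats).shape)  # torch.Size([2, 128])
```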
Jing ZHOU, Yiyu HU, Chengyu HU, Tianjiang WANG. Weakly perceived object detection method based on point cloud completion and multi-resolution Transformer[J]. Journal of Computer Applications, 2023, 43(7): 2155-2165.
| Detection method | mAP/% | AP 3D/% (IoU=0.7, Hard) | Time/ms |
| --- | --- | --- | --- |
| AVOD | 75.83 | 68.65 | 100 |
| F-PointNet | 72.78 | 63.65 | 170 |
| PI-RCNN | 80.56 | 76.17 | 107 |
| SECOND | 81.48 | 77.22 | 50 |
| PointPillars | 79.76 | 74.77 | 16 |
| Fast Point R-CNN | 81.87 | 77.48 | 65 |
| TANet | 80.56 | 75.62 | 38 |
| Associate-3Ddet | 82.07 | 77.76 | 60 |
| HVNet | 78.86 | 71.79 | 32 |
| PointRCNN | 81.63 | 77.38 | 100 |
| SECOND-Gaussian+ | 81.69 | 77.44 | 68 |
| SA-SSD | 82.95 | 78.78 | 40 |
| Part-A2 | 82.49 | 78.54 | 83 |
| WP-CMT (proposed) | 84.24 | 79.22 | 87 |
Tab. 1 Comparison of different methods on KITTI validation set for car class
| Detection method | AP 3D/% (IoU=0.7, Easy) | AP 3D/% (IoU=0.7, Moderate) | AP 3D/% (IoU=0.7, Hard) |
| --- | --- | --- | --- |
| AVOD | 83.07 | 71.76 | 65.73 |
| F-PointNet | 82.19 | 69.79 | 60.59 |
| PI-RCNN | 84.37 | 74.82 | 70.03 |
| SECOND | 83.13 | 73.66 | 66.20 |
| PointPillars | 82.58 | 74.31 | 68.99 |
| Fast Point R-CNN | 85.29 | 77.40 | 70.24 |
| TANet | 83.81 | 75.38 | 67.66 |
| Associate-3Ddet | 85.99 | 77.40 | 70.53 |
| PointRCNN | 86.96 | 75.64 | 70.70 |
| MSPTRCNN | 87.45 | 77.44 | 70.39 |
| Part-A2 | 87.81 | 78.49 | 73.51 |
| Pointformer | 87.13 | 77.06 | 69.25 |
| SA-SSD | 88.75 | 79.79 | 74.16 |
| WP-CMT (proposed) | 87.47 | 80.52 | 76.02 |
Tab. 2 Comparison of different methods on KITTI test set for car class
| Detection method | mAP/% (IoU=0.7, Level 1) | mAP/% (IoU=0.7, Level 2) | mAPH/% (IoU=0.7, Level 1) | mAPH/% (IoU=0.7, Level 2) |
| --- | --- | --- | --- | --- |
| PointPillars | 63.30 | 55.20 | 62.70 | 54.70 |
| SECOND | 68.03 | 59.57 | 67.44 | 59.04 |
| StarNet | 64.70 | 45.50 | 56.30 | 39.60 |
| MVF | 62.93 | — | — | — |
| PointRCNN | 45.05 | 37.41 | 44.25 | 36.74 |
| Pyramid-P | 47.02 | 39.10 | 46.58 | 38.76 |
| Part-A2 | 71.69 | 64.21 | 71.16 | 63.70 |
| WP-CMT | 73.04 | 65.80 | 72.52 | 65.31 |
Tab. 3 Comparison of different methods on validation sequences of Waymo dataset
| Experiment No. | Method | AP 3D/% (IoU=0.7, Easy) | AP 3D/% (IoU=0.7, Moderate) | AP 3D/% (IoU=0.7, Hard) | AP 3D/% (IoU=0.5, Easy) | AP 3D/% (IoU=0.5, Moderate) | AP 3D/% (IoU=0.5, Hard) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| A | Ablation baseline | 88.98 | 79.11 | 77.82 | 90.59 | 89.20 | 89.14 |
| B | Ablation baseline + MTE | 89.18 | 81.77 | 78.24 | 95.64 | 89.61 | 89.33 |
| C | Ablation baseline + TPC | 89.03 | 80.67 | 78.02 | 94.77 | 89.52 | 89.18 |
| D | WP-CMT | 89.58 | 83.93 | 79.22 | 97.14 | 89.67 | 89.35 |
| E | Part-A2 | 89.47 | 79.47 | 78.54 | 94.71 | 89.27 | 89.03 |
Tab. 4 Ablation experimental results of TPC and MTE modules
| Concatenation | Weighting | Activation | Residual | AP 3D/% (IoU=0.7, Easy) | AP 3D/% (IoU=0.7, Moderate) | AP 3D/% (IoU=0.7, Hard) |
| --- | --- | --- | --- | --- | --- | --- |
| √ | × | × | × | 89.23 | 82.95 | 78.80 |
| √ | √ | × | × | 89.45 | 83.82 | 79.15 |
| √ | √ | √ | × | 89.49 | 83.85 | 79.18 |
| √ | √ | √ | √ | 89.58 | 83.93 | 79.22 |
Tab. 5 Results of ablation experiments for AF sub-module
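The AF sub-module is not defined in this excerpt; the sketch below is a plausible minimal reading of the four ablated steps in Tab. 5 (concatenation, weighting, activation, residual), not the authors' implementation. Layer sizes and names are assumptions.

```python
import torch
import torch.nn as nn


class AttentiveFusion(nn.Module):
    """Hypothetical AF-style fusion of two feature streams."""

    def __init__(self, channels: int = 128):
        super().__init__()
        self.reduce = nn.Linear(2 * channels, channels)  # projects the concatenated features
        self.gate = nn.Linear(2 * channels, channels)    # produces per-channel weights
        self.act = nn.ReLU(inplace=True)

    def forward(self, shape_feat: torch.Tensor, pos_feat: torch.Tensor) -> torch.Tensor:
        cat = torch.cat([shape_feat, pos_feat], dim=-1)  # 1) concatenation
        fused = self.reduce(cat)
        weights = torch.sigmoid(self.gate(cat))          # 2) learned weighting
        fused = self.act(fused * weights)                # 3) activation
        return fused + shape_feat                        # 4) residual connection
```

Each row of Tab. 5 corresponds to enabling one more of these steps, and every step adds a small but consistent gain on the hard level.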
| Experiment No. | Method | AP 3D/% (IoU=0.7, Easy) | AP 3D/% (IoU=0.7, Moderate) | AP 3D/% (IoU=0.7, Hard) |
| --- | --- | --- | --- | --- |
| F | PointNet++ encoding | 89.29 | 82.99 | 78.56 |
| G | MT encoding (proposed) | 89.58 | 83.93 | 79.22 |
Tab. 6 Comparison of different feature encoding sub-modules in MTE module
| Method | AP 3D/% (IoU=0.7, Easy) | AP 3D/% (IoU=0.7, Moderate) | AP 3D/% (IoU=0.7, Hard) |
| --- | --- | --- | --- |
| PointRCNN | 88.88 | 78.63 | 77.39 |
| PointRCNN + TPC + MTE | 89.49 | 79.98 | 78.62 |
| Part-A2 | 89.47 | 79.47 | 78.54 |
| Part-A2 + TPC + MTE (proposed) | 89.58 | 83.93 | 79.22 |
Tab. 7 Comparison of different RPN structures
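Tab. 7 indicates that TPC and MTE act as refinement modules that can follow different first-stage RPNs (PointRCNN or Part-A2). A rough composition sketch, with all class and method names being illustrative assumptions rather than the paper's API:

```python
import torch.nn as nn


class WeaklyPerceivedRefiner(nn.Module):
    """Hypothetical wrapper: completion (TPC) + multi-resolution encoding (MTE) on top of any RPN."""

    def __init__(self, rpn: nn.Module, tpc: nn.Module, mte: nn.Module, head: nn.Module):
        super().__init__()
        self.rpn, self.tpc, self.mte, self.head = rpn, tpc, mte, head

    def forward(self, points):
        proposals, roi_points = self.rpn(points)   # stage 1: e.g. PointRCNN or Part-A2 proposals
        dense_points = self.tpc(roi_points)        # stage 2a: complete sparse object surfaces
        roi_feats = self.mte(dense_points)         # stage 2b: multi-resolution Transformer encoding
        return self.head(roi_feats, proposals)     # stage 3: refined detection boxes
```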
1 | CHEN X Z, MA H M, WAN J, et al. Multi-view 3D object detection network for autonomous driving [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 6526-6534. 10.1109/cvpr.2017.691 |
2 | KU J, MOZIFIAN M, LEE J, et al. Joint 3D proposal generation and object detection from view aggregation [C]// Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems. Piscataway: IEEE, 2018: 1-8. 10.1109/iros.2018.8594049 |
3 | LIANG M, YANG B, CHEN Y, et al. Multi-task multi-sensor fusion for 3D object detection [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 7337-7345. 10.1109/cvpr.2019.00752 |
4 | ZHOU Y, SUN P, ZHANG Y, et al. End-to-end multi-view fusion for 3D object detection in LiDAR point clouds [C]// Proceedings of the 3rd Conference on Robot Learning. New York: JMLR.org, 2020: 923-932. |
5 | DENG J J, ZHOU W G, ZHANG Y Y, et al. From multi-view to Hollow-3D: hallucinated Hollow-3D R-CNN for 3D object detection[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2021, 31(12): 4722-4734. 10.1109/tcsvt.2021.3100848 |
6 | QI C R, SU H, MO K C, et al. PointNet: deep learning on point sets for 3D classification and segmentation [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 77-85. 10.1109/cvpr.2017.16 |
7 | QI C R, YI L, SU H, et al. PointNet++: deep hierarchical feature learning on point sets in a metric space [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2017: 5105-5114. |
8 | QI C R, LIU W, WU C X, et al. Frustum PointNets for 3D object detection from RGB-D data [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 918-927. 10.1109/cvpr.2018.00102 |
9 | QI C R, LITANY O, HE K M, et al. Deep Hough voting for 3D object detection in point clouds [C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 9276-9285. 10.1109/iccv.2019.00937 |
10 | SHI S S, WANG X G, LI H S. PointRCNN: 3D object proposal generation and detection from point cloud [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 770-779. 10.1109/cvpr.2019.00086 |
11 | YANG Z T, SUN Y N, LIU S, et al. 3DSSD: point-based 3D single stage object detector [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 11037-11045. 10.1109/cvpr42600.2020.01105 |
12 | MISRA I, GIRDHAR R, JOULIN A. An end-to-end transformer model for 3D object detection [C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 2886-2897. 10.1109/iccv48922.2021.00290 |
13 | LIU Z, ZHANG Z, CAO Y, et al. Group-free 3D object detection via transformers [C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 2929-2938. 10.1109/iccv48922.2021.00294 |
14 | PAN X R, XIA Z F, SONG S J, et al. 3D object detection with Pointformer [C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 7459-7468. 10.1109/cvpr46437.2021.00738 |
15 | SUN L J, ZHAO J, WANG W J, et al. Multi-scale transformer LiDAR point cloud 3D object detection[J]. Computer Engineering and Applications, 2022, 58(8): 136-146 (in Chinese). 10.3778/j.issn.1002-8331.2109-0489 |
16 | ZHANG Y F, HU Q Y, XU G Q, et al. Not all points are equal: learning highly efficient point-based detectors for 3D LiDAR point clouds [C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 18931-18940. 10.1109/cvpr52688.2022.01838 |
17 | YAN Y, MAO Y X, LI B. SECOND: sparsely embedded convolutional detection[J]. Sensors, 2018, 18(10): No.3337. 10.3390/s18103337 |
18 | LANG A H, VORA S, CAESAR H, et al. PointPillars: fast encoders for object detection from point clouds [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 12689-12697. 10.1109/cvpr.2019.01298 |
19 | LIU Z, ZHAO X, HUANG T T, et al. TANet: robust 3D object detection from point clouds with triple attention [C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2020: 11677-11684. 10.1609/aaai.v34i07.6837 |
20 | YE M S, XU S J, CAO T Y. HVNet: hybrid voxel network for LiDAR based 3D object detection [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 1628-1637. 10.1109/cvpr42600.2020.00170 |
21 | DENG J J, SHI S S, LI P W, et al. Voxel R-CNN: towards high performance voxel-based 3D object detection [C]// Proceedings of the 35th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2021: 1201-1209. 10.1609/aaai.v35i2.16207 |
22 | ZHANG W C, LI W, XU D. SRDAN: scale-aware and range-aware domain adaptation network for cross-dataset 3D object detection [C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 6765-6775. 10.1109/cvpr46437.2021.00670 |
23 | HE C H, LI R H, LI S, et al. Voxel set Transformer: a set-to-set approach to 3D object detection from point clouds [C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 8407-8417. 10.1109/cvpr52688.2022.00823 |
24 | HE C H, ZENG H, HUANG J Q, et al. Structure aware single-stage 3D object detection from point cloud [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 11870-11879. 10.1109/cvpr42600.2020.01189 |
25 | SHI S S, WANG Z, SHI J P, et al. From points to parts: 3D object detection from point cloud with part-aware and part-aggregation network[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(8): 2647-2664. |
26 | XIE L, XIANG C, YU Z X, et al. PI-RCNN: an efficient multi-sensor 3D object detector with point-based attentive Cont-Conv fusion module [C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2020: 12460-12467. 10.1609/aaai.v34i07.6933 |
27 | CHEN Y L, LIU S, SHEN X Y, et al. Fast point R-CNN [C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 9774-9783. 10.1109/iccv.2019.00987 |
28 | DU L, YE X Q, TAN X, et al. Associate-3Ddet: perceptual-to-conceptual association for 3D point cloud object detection [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 13326-13335. 10.1109/cvpr42600.2020.01334 |
29 | PEI Y Y, GUO H M, ZHANG D P, et al. Robust 3D object detection method based on localization uncertainty[J]. Journal of Computer Applications, 2021, 41(10): 2979-2984 (in Chinese). 10.11772/j.issn.1001-9081.2020122055 |
30 | NGIAM J, CAINE B, HAN W, et al. StarNet: targeted computation for object detection in point clouds[EB/OL]. (2019-12-02) [2022-06-19]. . |
31 | MAO J G, NIU M Z, BAI H Y, et al. Pyramid R-CNN: towards better performance and adaptability for 3D object detection [C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 2703-2712. 10.1109/iccv48922.2021.00272 |