Journal of Computer Applications, 2025, Vol. 45, Issue (9): 3003-3010. DOI: 10.11772/j.issn.1001-9081.2024091254
• Multimedia computing and computer simulation •
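Received: 2024-09-05
Revised: 2024-10-16
Accepted: 2024-10-18
Online: 2024-10-31
Published: 2025-09-10
Corresponding author: Jiale SHAO
About the authors:
LI Weigang, born in 1977 in Xianning, Hubei, Ph. D., professor. His research interests include industrial process control, artificial intelligence, and machine learning.
[author name missing] Research interests: deep learning, point cloud data processing.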
Weigang LI, Jiale SHAO, Zhiqiang TIAN
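Abstract:
Existing networks struggle to learn the local geometric shape information of point clouds: they fail to focus on important feature structures and fuse features insufficiently. To address these problems, a point cloud classification and segmentation network based on Dual Attention Mechanism (DAM) and multi-scale fusion was proposed. Firstly, in the feature extraction stage, Geometric Adaptive Convolution (GAC) was used to adjust the geometric positions and weights of the convolution kernels dynamically, so that the kernels adapt to the local geometric structure of the point cloud and capture local features more effectively. Secondly, to further improve feature representation, DAM was introduced to learn and adjust the weights of feature channels and spatial information automatically, thereby strengthening the feature representation of key points. Finally, features of different scales were concatenated for fusion, which enriched the final feature representation and improved the classification and segmentation accuracy of the network. Experimental results on the ModelNet40, ShapeNet, and S3DIS datasets show that the proposed network achieves higher Overall Accuracy (OA) and mean Intersection over Union (mIoU) than PointNet++ and DGCNN (Dynamic Graph Convolutional Neural Network), effectively improving point cloud classification and segmentation performance.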
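The abstract names two mechanisms, attention over feature channels and over spatial positions, followed by concatenation of features from different scales. The sketch below illustrates only that general idea. The parameter-free sigmoid gates and the function names `dual_attention` and `fuse_scales` are assumptions made for illustration; the paper's actual DAM and fusion layers are learned and are not specified on this page.

```python
import math
from typing import List

Matrix = List[List[float]]  # N points x C channels


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def dual_attention(features: Matrix) -> Matrix:
    """Reweight an N x C feature matrix per channel and per point.

    Hypothetical stand-in: real attention modules learn these gates.
    """
    n, c = len(features), len(features[0])
    # Channel gate: one weight per channel, from its mean over all points.
    channel_w = [sigmoid(sum(row[j] for row in features) / n) for j in range(c)]
    # Spatial gate: one weight per point, from its mean over all channels.
    spatial_w = [sigmoid(sum(row) / c) for row in features]
    return [[features[i][j] * channel_w[j] * spatial_w[i] for j in range(c)]
            for i in range(n)]


def fuse_scales(scales: Matrix.__class__) -> Matrix:
    """Concatenate per-point features from several scales along the channel axis."""
    n = len(scales[0])
    return [[v for scale in scales for v in scale[i]] for i in range(n)]


coarse = [[0.2, 1.0], [0.5, -0.3], [1.5, 0.7]]  # 3 points, 2 channels
fine = [[0.1], [0.9], [-0.4]]                   # 3 points, 1 channel
fused = fuse_scales([dual_attention(coarse), dual_attention(fine)])
print(len(fused), len(fused[0]))  # 3 3
```

Both gates lie in (0, 1), so each feature is attenuated in proportion to how weakly its channel and its point respond; a learned module would replace the plain means with trainable projections.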
Weigang LI, Jiale SHAO, Zhiqiang TIAN. Point cloud classification and segmentation network based on dual attention mechanism and multi-scale fusion[J]. Journal of Computer Applications, 2025, 45(9): 3003-3010.
Method | Input | mAcc/% | OA/% | Computation/GFLOPs |
---|---|---|---|---|
VoxNet | voxel | 83.0 | 85.9 | - |
MVCNN | image | - | 90.1 | - |
PointNet | point | 86.2 | 89.2 | 0.440 |
PointNet++ | point | 88.3 | 90.7 | 0.870 |
DGCNN | point | 90.2 | 92.9 | 2.450 |
PCNN | point | 88.1 | 92.2 | - |
PointConv | point | - | 92.5 | - |
PCT | point | - | 93.2 | 2.320 |
PointWeb | point | 89.4 | 92.3 | - |
Point Transformer | point | - | 92.8 | - |
JGEKD | point | 90.9 | 93.4 | - |
Proposed method | point | 91.3 | 93.5 | 2.204 |
Tab. 1 Comparison of classification results on the ModelNet40 dataset
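Tab. 1 reports two accuracy figures per method. As a reminder of how they usually differ, the sketch below computes both under the standard definitions: OA is the fraction of all samples classified correctly, and mAcc is the unweighted mean of per-class accuracies. The function name and the toy labels are illustrative assumptions, not data from the paper.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def classification_metrics(y_true: List[int], y_pred: List[int]) -> Tuple[float, float]:
    """Return (OA, mAcc) in percent."""
    correct: Dict[int, int] = defaultdict(int)
    total: Dict[int, int] = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    # OA: correct predictions over all samples.
    oa = 100.0 * sum(correct.values()) / len(y_true)
    # mAcc: average of per-class accuracies, each class counted once.
    macc = 100.0 * sum(correct[c] / total[c] for c in total) / len(total)
    return oa, macc


# Toy data: class 0 is frequent and easy, class 1 is rare and hard.
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 1, 0]
oa, macc = classification_metrics(y_true, y_pred)
print(f"OA={oa:.1f}%  mAcc={macc:.1f}%")  # OA=87.5%  mAcc=75.0%
```

Because mAcc weights every class equally, it falls below OA when rare classes are harder, which matches the pattern in every row of Tab. 1 that reports both values.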
Method | Airplane | Bag | Cap | Car | Chair | Earphone | Guitar | Knife | Lamp | Laptop | Motorbike | Mug | Pistol | Rocket | Skateboard | Table | Cls. mIoU | mIoU |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
PointNet | 83.4 | 78.7 | 82.5 | 74.9 | 89.6 | 73.0 | 91.5 | 85.9 | 80.8 | 95.3 | 65.2 | 93.0 | 81.2 | 57.9 | 72.8 | 80.6 | 80.4 | 83.7 |
PointNet++ | 82.4 | 79.0 | 87.7 | 77.3 | 90.8 | 71.8 | 91.0 | 85.9 | 83.7 | 95.3 | 71.6 | 94.1 | 81.3 | 58.7 | 76.4 | 82.6 | 81.9 | 85.1 |
DGCNN | 84.0 | 83.4 | 86.7 | 77.8 | 90.6 | 74.7 | 91.2 | 87.5 | 82.8 | 95.7 | 66.3 | 94.9 | 81.1 | 63.5 | 74.5 | 82.6 | 82.3 | 85.2 |
LDGCNN | 84.0 | 83.0 | 84.9 | 78.4 | 90.6 | 74.4 | 91.0 | 88.1 | 83.4 | 95.8 | 67.4 | 94.9 | 82.3 | 59.2 | 76.0 | 81.9 | 82.2 | 85.1 |
PCNN | 82.4 | 80.1 | 85.5 | 79.5 | 90.8 | 73.2 | 91.3 | 86.0 | 85.0 | 95.7 | 73.2 | 94.8 | 83.3 | 51.0 | 75.0 | 81.8 | 81.8 | 85.1 |
PointASNL | 84.1 | 84.7 | 87.9 | 79.7 | 92.2 | 73.7 | 91.0 | 87.2 | 84.2 | 95.8 | 74.4 | 95.2 | 81.0 | 63.0 | 76.3 | 83.2 | 83.3 | 86.1 |
Proposed method | 83.3 | 85.3 | 90.4 | 77.6 | 90.8 | 78.1 | 91.4 | 88.0 | 85.1 | 95.9 | 72.5 | 95.3 | 82.2 | 63.7 | 77.8 | 83.1 | 83.8 | 86.4 |
Tab. 2 Comparison of part segmentation performance of different methods on the ShapeNet dataset (per-category IoU, %)
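The two summary columns of Tab. 2 aggregate differently. The sketch below first checks that Cls. mIoU equals the unweighted mean of the 16 per-category IoUs, using two rows copied from the table, and then uses made-up per-shape values to show why the instance-level mIoU can be higher. It assumes the usual ShapeNet part convention (instance mIoU averages over all shapes, so large categories dominate); the page itself does not define the columns, and the function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Per-category IoU values copied from Tab. 2 (Airplane ... Table).
TABLE2_ROWS: Dict[str, List[float]] = {
    "PointNet": [83.4, 78.7, 82.5, 74.9, 89.6, 73.0, 91.5, 85.9,
                 80.8, 95.3, 65.2, 93.0, 81.2, 57.9, 72.8, 80.6],
    "Proposed method": [83.3, 85.3, 90.4, 77.6, 90.8, 78.1, 91.4, 88.0,
                        85.1, 95.9, 72.5, 95.3, 82.2, 63.7, 77.8, 83.1],
}

# Cls. mIoU: every category counts once, regardless of its size.
cls_miou = {name: round(sum(v) / len(v), 1) for name, v in TABLE2_ROWS.items()}
print(cls_miou)  # {'PointNet': 80.4, 'Proposed method': 83.8}


def class_and_instance_miou(shape_ious: List[Tuple[str, float]]) -> Tuple[float, float]:
    """Return (class mIoU, instance mIoU) from one (category, IoU) pair per shape."""
    per_cat: Dict[str, List[float]] = defaultdict(list)
    for cat, iou in shape_ious:
        per_cat[cat].append(iou)
    class_miou = sum(sum(v) / len(v) for v in per_cat.values()) / len(per_cat)
    instance_miou = sum(iou for _, iou in shape_ious) / len(shape_ious)
    return class_miou, instance_miou


# Toy shapes (not from the paper): a large, easy category and a small, hard one.
toy = [("chair", 90.0), ("chair", 92.0), ("chair", 91.0), ("rocket", 60.0)]
print(class_and_instance_miou(toy))  # (75.5, 83.25)
```

The recomputed 80.4 and 83.8 agree with the Cls. mIoU column. The instance mIoU column cannot be recomputed from the table alone because it depends on the number of shapes in each category.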
Method | mIoU/% | mAcc/% |
---|---|---|
PointNet | 41.1 | 48.9 |
PointNet++ | 50.6 | - |
DGCNN | 47.0 | - |
PCNN | 57.3 | 63.9 |
Proposed method | 59.1 | 65.3 |
Tab. 3 Comparison of semantic segmentation results of different methods
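For semantic segmentation, mIoU is computed from per-point labels rather than per-shape scores. The following is a minimal sketch under the standard definition, IoU = TP / (TP + FP + FN) per class, averaged over classes. The function name, the choice to skip classes that never occur, and the toy labels are assumptions for illustration.

```python
from typing import List


def mean_iou(y_true: List[int], y_pred: List[int], num_classes: int) -> float:
    """Mean over classes of TP / (TP + FP + FN), in percent.

    Classes absent from both labels and predictions are skipped.
    """
    ious = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        union = tp + fp + fn
        if union:
            ious.append(tp / union)
    return 100.0 * sum(ious) / len(ious)


# Toy scene with six points and three classes; one point of class 0 is mislabeled as 1.
labels = [0, 0, 0, 1, 1, 2]
preds = [0, 0, 1, 1, 1, 2]
print(f"mIoU={mean_iou(labels, preds, 3):.1f}%")  # mIoU=77.8%
```

A single wrong point lowers the IoU of two classes at once (a false negative for one, a false positive for the other), which is why mIoU is typically well below point-wise accuracy.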
Number of weight matrices | OA/% |
---|---|
2 | 92.2 |
4 | 92.4 |
8 | 93.4 |
16 | 93.0 |
Tab. 4 OA with different numbers of weight matrices
Method | mAcc/% | OA/% |
---|---|---|
PointNet++ | 88.3 | 90.7 |
+GAC | 90.4 | 92.4 |
+GAC+DAM | 90.7 | 92.8 |
+GAC+MSFF | 90.8 | 93.1 |
Proposed method | 91.3 | 93.5 |
Tab. 5 Results of ablation experiments on the ModelNet40 dataset
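The contribution of each module is easier to read as a gain over the PointNet++ baseline. The short script below does only that arithmetic on the values in Tab. 5; it adds no new results.

```python
from typing import Dict, Tuple

# (mAcc, OA) in percent, copied from Tab. 5.
rows: Dict[str, Tuple[float, float]] = {
    "PointNet++": (88.3, 90.7),
    "+GAC": (90.4, 92.4),
    "+GAC+DAM": (90.7, 92.8),
    "+GAC+MSFF": (90.8, 93.1),
    "Proposed method": (91.3, 93.5),
}

base_macc, base_oa = rows["PointNet++"]
# Gain of each configuration over the baseline, in percentage points.
gains = {
    name: (round(macc - base_macc, 1), round(oa - base_oa, 1))
    for name, (macc, oa) in rows.items()
}
for name, (d_macc, d_oa) in gains.items():
    print(f"{name:16s} mAcc +{d_macc:.1f}  OA +{d_oa:.1f}")
# last line printed: Proposed method  mAcc +3.0  OA +2.8
```

Read this way, GAC alone accounts for 1.7 of the 2.8 percentage points of OA gained over PointNet++, and adding DAM and MSFF together contributes the remaining 1.1.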