Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (5): 1671-1676.DOI: 10.11772/j.issn.1001-9081.2024050572
• Multimedia computing and computer simulation •
Kai CHEN, Hailiang YE, Feilong CAO
Received: 2024-05-09
Revised: 2024-06-24
Accepted: 2024-06-26
Online: 2024-07-25
Published: 2025-05-10
Contact: Feilong CAO
About author: CHEN Kai, born in 1998, from Hangzhou, Zhejiang, M. S. candidate. His research interests include deep learning and computer vision.
Kai CHEN, Hailiang YE, Feilong CAO. Classification algorithm for point cloud based on local-global interaction and structural Transformer[J]. Journal of Computer Applications, 2025, 45(5): 1671-1676.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2024050572
Algorithm | Input format | Number of input points | mAcc/% | OA/% |
---|---|---|---|---|
PointNet[2017] | Coordinates | 1 024 | 86.0 | 89.2 |
PointNet++[2017] | Coordinates + normals | 5 000 | — | 91.9 |
RS-CNN[2019] | Coordinates | 1 024 | — | 92.9 |
DGCNN[2019] | Coordinates | 1 024 | 90.2 | 92.9 |
KPConv[2019] | Coordinates | 6 800 | — | 92.9 |
DRNet[2021] | Coordinates | 1 024 | — | 93.1 |
PRA-Net[2021] | Coordinates | 1 024 | 90.6 | 93.2 |
PCT[2021] | Coordinates | 1 024 | — | 93.2 |
CT[2021] | Coordinates | 1 024 | 90.8 | 93.1 |
Point-BERT[2022] | Coordinates | 1 024 | — | 93.2 |
PatchFormer[2022] | Coordinates | 1 024 | — | 93.2 |
CSANet[2022] | Coordinates | 1 024 | 89.9 | 92.8 |
LFT-Net[2023] | Coordinates + normals | 1 024 | 89.7 | 93.2 |
AGConv[2023] | Coordinates | 1 024 | 90.7 | 93.4 |
LGSTNet | Coordinates | 1 024 | 90.8 | 93.6 |
Tab. 1 Classification performance comparison on ModelNet40 dataset
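The tables on this page report two metrics, mAcc and OA, without defining them. A minimal sketch of how they are usually computed, assuming the standard definitions (OA is overall accuracy across all samples; mAcc is the mean of per-class accuracies). The function names and the toy labels below are hypothetical illustrations, not taken from the paper:

```python
from collections import defaultdict


def overall_accuracy(y_true, y_pred):
    """Fraction of all samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def mean_class_accuracy(y_true, y_pred):
    """Average of per-class accuracies, so each class counts equally."""
    total = defaultdict(int)  # number of samples per true class
    hit = defaultdict(int)    # correctly classified samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        hit[t] += int(t == p)
    return sum(hit[c] / total[c] for c in total) / len(total)


# Made-up labels (assumption for illustration): class 0 has 4 samples, class 1 has 1.
y_true = [0, 0, 0, 0, 1]
y_pred = [0, 0, 0, 0, 0]

oa = overall_accuracy(y_true, y_pred)        # 4 of 5 samples are correct
macc = mean_class_accuracy(y_true, y_pred)   # class 0: 4/4, class 1: 0/1
print(f"OA = {oa:.3f}, mAcc = {macc:.3f}")
```

On imbalanced data the two can diverge sharply, as in this toy case, which is one reason classification results are commonly reported with both.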
Algorithm | mAcc/% | OA/% |
---|---|---|
PointNet[2017] | 63.4 | 68.2 |
PointNet++[2017] | 75.4 | 77.9 |
DGCNN[2019] | 73.6 | 78.1 |
MVTN[2021] | — | 82.8 |
DRNet[2021] | 78.0 | 80.3 |
CT[2021] | 83.1 | 85.5 |
PointFormer[2022] | 78.9 | 81.1 |
Point-BERT[2022] | — | 83.1 |
PointMLP[2022] | 83.9 | 85.4 |
RepSurf-U[2022] | 83.1 | 86.0 |
GLSCN[2023] | 84.1 | 85.8 |
Point-PN[2023] | — | 87.1 |
LGSTNet | 86.5 | 87.5 |
Tab. 2 Classification performance comparison on ScanObjectNN dataset
Model | Local-global interaction framework: local feature branch | Local-global interaction framework: global feature branch | Structural Transformer | OA/% | mAcc/% |
---|---|---|---|---|---|
A | √ | | | 86.0 | 84.5 |
B | | √ | | 80.7 | 77.8 |
C | √ | √ | | 86.7 | 85.2 |
D | √ | √ | √ | 87.5 | 86.5 |
Tab. 3 Module ablation experimental results
Algorithm | Parameters/10^6 | Throughput/(shape· | GFLOPs | OA/% |
---|---|---|---|---|
PointNet | 3.47 | 518 | 0.45 | 68.2 |
PointNet++ | 1.74 | 29 | 4.03 | 77.9 |
DGCNN | 1.81 | 104 | 2.43 | 78.1 |
CT | 22.91 | 16 | 12.69 | 85.5 |
PointFormer | 3.99 | 94 | 3.48 | 81.1 |
PointMLP | 12.60 | 19 | 15.70 | 85.5 |
LGSTNet | 4.96 | 151 | 3.36 | 87.5 |
Tab. 4 Complexity comparison of different algorithms on ScanObjectNN dataset
k | mAcc/% | OA/% | k | mAcc/% | OA/% |
---|---|---|---|---|---|
8 | 81.8 | 84.0 | 20 | 84.8 | 86.7 |
12 | 83.3 | 85.4 | 24 | 84.8 | 86.6 |
16 | 86.5 | 87.5 |
Tab. 5 Influence of k on performance on ScanObjectNN dataset
i | mAcc/% | OA/% | Parameters/10^6 |
---|---|---|---|
3 | 84.6 | 85.6 | 1.40 |
4 | 86.5 | 87.5 | 4.96 |
5 | 85.6 | 87.5 | 18.89 |
Tab. 6 Influence of i on performance on ScanObjectNN dataset
1 | ZHENG C, YAN X, ZHANG H, et al. Beyond 3D Siamese tracking: a motion-centric paradigm for 3D single object tracking in point clouds[C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 8101-8110. |
2 | CHEN X, MA H, WAN J, et al. Multi-view 3D object detection network for autonomous driving[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 6526-6534. |
3 | TU C, TAKEUCHI E, CARBALLO A, et al. Point cloud compression for 3D LiDAR sensor using recurrent neural network with residual blocks[C]// Proceedings of the 2019 International Conference on Robotics and Automation. Piscataway: IEEE, 2019: 3274-3280. |
4 | SU H, MAJI S, KALOGERAKIS E, et al. Multi-view convolutional neural networks for 3D shape recognition[C]// Proceedings of the 2015 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2015: 945-953. |
5 | MATURANA D, SCHERER S. VoxNet: a 3D convolutional neural network for real-time object recognition[C]// Proceedings of the 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems. Piscataway: IEEE, 2015: 922-928. |
6 | HAMDI A, GIANCOLA S, GHANEM B. MVTN: multi-view transformation network for 3D shape recognition[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 1-11. |
7 | QI C R, SU H, MO K, et al. PointNet: deep learning on point sets for 3D classification and segmentation[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 77-85. |
8 | QI C R, YI L, SU H, et al. PointNet++: deep hierarchical feature learning on point sets in a metric space[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2017: 5105-5114. |
9 | MA X, QIN C, YOU H, et al. Rethinking network design and local geometry in point cloud: a simple residual MLP framework[EB/OL]. [2024-10-22].. |
10 | ZHANG R, WANG L, WANG Y, et al. Starting from non-parametric networks for 3D point cloud analysis[C]// Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2023: 5344-5353. |
11 | WEI M, WEI Z, ZHOU H, et al. AGConv: adaptive graph convolution on 3D point clouds[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(8): 9374-9392. |
12 | THOMAS H, QI C R, DESCHAUD J E, et al. KPConv: flexible and deformable convolution for point clouds[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 6410-6419. |
13 | LIU Y, FAN B, XIANG S, et al. Relation-shape convolutional neural network for point cloud analysis[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 8887-8896. |
14 | SIMONOVSKY M, KOMODAKIS N. Dynamic edge-conditioned filters in convolutional neural networks on graphs[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 29-38. |
15 | DU Z, YE H, CAO F. A novel local-global graph convolutional method for point cloud semantic segmentation[J]. IEEE Transactions on Neural Networks and Learning Systems, 2024, 35(4): 4798-4812. |
16 | WANG Y, SUN Y, LIU Z, et al. Dynamic graph CNN for learning on point clouds[J]. ACM Transactions on Graphics, 2019, 38(5): No.146. |
17 | QIU S, ANWAR S, BARNES N. Dense-resolution network for point cloud classification and segmentation[C]// Proceedings of the 2021 IEEE Winter Conference on Applications of Computer Vision. Piscataway: IEEE, 2021: 3812-3821. |
18 | RAN H, LIU J, WANG C. Surface representation for point clouds[C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 18920-18930. |
19 | LIANG J, DU Z, LIANG J, et al. Long and short-range dependency graph structure learning framework on point cloud[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(12): 14975-14989. |
20 | VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2017: 6000-6010. |
21 | BROWN T B, MANN B, RYDER N, et al. Language models are few-shot learners[C]// Proceedings of the 34th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2020: 1877-1901. |
22 | DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: Transformers for image recognition at scale[EB/OL]. [2024-10-22].. |
23 | WANG W, XIE E, LI X, et al. Pyramid vision Transformer: a versatile backbone for dense prediction without convolutions[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 548-558. |
24 | WU H, XIAO B, CODELLA N, et al. CVT: introducing convolutions to vision Transformers[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 22-31. |
25 | ZHAO H, JIANG L, JIA J, et al. Point Transformer[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 16239-16248. |
26 | GUO M H, CAI J X, LIU Z N, et al. PCT: point cloud Transformer[J]. Computational Visual Media, 2021, 7(2): 187-199. |
27 | WANG G, ZHAI Q, LIU H. Cross self-attention network for 3D point cloud[J]. Knowledge-Based Systems, 2022, 247: No.108769. |
28 | YU X, TANG L, RAO Y, et al. Point-BERT: pre-training 3D point cloud Transformers with masked point modeling[C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 19291-19300. |
29 | CHENG S, CHEN X, HE X, et al. PRA-Net: point relation-aware network for 3D point cloud analysis[J]. IEEE Transactions on Image Processing, 2021, 30: 4436-4448. |
30 | CHEN Y, YANG Z, ZHENG X, et al. PointFormer: a dual perception attention-based network for point cloud classification[C]// Proceedings of the Asian Conference on Computer Vision, LNCS 13841. Cham: Springer, 2023: 432-449. |
31 | MAZUR K, LEMPITSKY V. Cloud Transformers: a universal approach to point cloud processing tasks[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 10695-10704. |
32 | ZHOU W, ZHAO Y, XIAO Y, et al. TNPC: Transformer-based network for point cloud classification[J]. Expert Systems with Applications, 2024, 239: No.122438. |
33 | ZHANG C, WAN H, SHEN X, et al. PatchFormer: an efficient point Transformer with patch attention[C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 11789-11798. |
34 | GAO Y, LIU X, LI J, et al. LFT-Net: local feature Transformer network for point clouds analysis[J]. IEEE Transactions on Intelligent Transportation Systems, 2023, 24(2): 2158-2168. |
35 | LAI X, LIU J, JIANG L, et al. Stratified Transformer for 3D point cloud segmentation[C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 8490-8499. |
36 | LIU Z, LIN Y, CAO Y, et al. Swin Transformer: hierarchical vision Transformer using shifted windows[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 9992-10002. |
37 | WU K, PENG H, CHEN M, et al. Rethinking and improving relative position encoding for Vision Transformer[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 10013-10021. |
38 | SI C, YU W, ZHOU P, et al. Inception Transformer[C]// Proceedings of the 36th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2022: 23495-23509. |
39 | WU Z, SONG S, KHOSLA A, et al. 3D ShapeNets: a deep representation for volumetric shapes[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 1912-1920. |
40 | UY M A, PHAM Q H, HUA B S, et al. Revisiting point cloud classification: a new benchmark dataset and classification model on real-world data[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 1588-1597. |
41 | VAN DER MAATEN L, HINTON G. Visualizing data using t-SNE[J]. Journal of Machine Learning Research, 2008, 9: 2579-2605. |