Few-shot recognition method of 3D models based on Transformer

doi:10.11772/j.issn.1001-9081.2022060952

Abstract:

To address the classification of Three-Dimensional (3D) models, a Transformer-based few-shot recognition method for 3D models is proposed. First, the 3D point cloud models of the support and query samples are fed into the feature extraction module to obtain feature vectors. Then, the attention features of the support samples are computed in the Transformer module. Finally, a cosine similarity network computes the relation scores between the query samples and the support samples. On the ModelNet 40 dataset, compared with the Dual-Long Short-Term Memory (Dual-LSTM) method, the proposed method improves the 5-way 1-shot and 5-way 5-shot recognition accuracy by 34.54 and 21.00 percentage points, respectively. The proposed method also achieves high accuracy on the ShapeNet Core dataset. Experimental results show that the proposed method recognizes new categories of 3D models more accurately.

Key words: few-shot recognition, Three-Dimensional (3D) model, attention mechanism, point cloud neural network, meta-learning
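
The three steps named in the abstract (feature vectors, attention over the support samples, cosine-similarity relation scores) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the feature vectors are random stand-ins for the output of the feature extraction module, the single-head self-attention with random projection matrices `w_q`, `w_k`, `w_v` is a simplified stand-in for the Transformer module, and all function names are hypothetical. Only the last step, cosine similarity between query and support features, follows the abstract directly.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(support, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention with a residual connection.

    Assumption: a simplified stand-in for the Transformer module.
    support has shape (n_support, dim).
    """
    q, k, v = support @ w_q, support @ w_k, support @ w_v
    weights = softmax(q @ k.T / np.sqrt(k.shape[-1]), axis=-1)
    return support + weights @ v


def cosine_relation_scores(query, support, eps=1e-8):
    """Cosine similarity between every query and every support feature.

    Returns an array of shape (n_query, n_support) with values in [-1, 1].
    """
    q = query / (np.linalg.norm(query, axis=1, keepdims=True) + eps)
    s = support / (np.linalg.norm(support, axis=1, keepdims=True) + eps)
    return q @ s.T


# Hypothetical 5-way 1-shot episode with random 64-dimensional features.
rng = np.random.default_rng(0)
n_way, dim = 5, 64
support = rng.normal(size=(n_way, dim))      # one feature vector per class
query = rng.normal(size=(1, dim))            # one query sample
w_q, w_k, w_v = (rng.normal(scale=dim ** -0.5, size=(dim, dim)) for _ in range(3))

attended = self_attention(support, w_q, w_k, w_v)
scores = cosine_relation_scores(query, attended)
predicted_class = int(scores.argmax(axis=1)[0])  # index of the best-matching class
```

In a trained model the projection matrices and the feature extractor would be learned over many episodes; here they are random, so `predicted_class` carries no meaning beyond showing the data flow.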


CLC Number:

TP391.41

Hui WANG, Jianhong LI. Few-shot recognition method of 3D models based on Transformer[J]. Journal of Computer Applications, 2023, 43(6): 1750-1758.


Figures/Tables 19

Fig. 1 Framework of few-shot recognition method of 3D models based on Transformer

Fig. 2 Framework of feature extraction module

Fig. 3 Structure of Transformer module

Fig. 4 Visualization of some models in ModelNet 40 dataset

Fig. 5 Visualization of some models in ShapeNet Core dataset

Fig. 6 Point cloud models with different numbers of points

Fig. 7 Prediction accuracy varying with different sampling point numbers

Tab. 1 Accuracy (%) of 1-shot experiments at different sampling point numbers

Number of sampling points	ModelNet 40		ShapeNet Core.v2		ShapeNet Core_normal
Number of sampling points	3-way	5-way	3-way	5-way	3-way	5-way
256	83.28	78.25	80.12	79.06	85.99	80.86
512	86.59	79.06	80.28	79.63	96.05	84.39
1024	87.37	80.86	81.75	81.51	83.96	82.25
2048	87.21	81.32	80.63	79.96	78.68	81.01

Fig. 8 Curves of loss changing with the number of iterations at different sampling point numbers

Tab. 2 Accuracies (%) of the proposed method in 1-shot experiments on ModelNet 40 and ShapeNet Core datasets

Dataset	3-way	5-way
ModelNet 40	87.37	80.86
ShapeNet Core.v2	81.75	81.51
ShapeNet Core_normal	83.96	82.25

Fig. 9 Curves of loss changing with C value on different datasets

Tab. 3 Accuracies (%) of 5-way K-shot experiments on ModelNet 40 and ShapeNet Core_normal datasets

Dataset	K=1	K=2	K=5	K=10
ModelNet 40	80.86	81.25	83.77	84.21
ShapeNet Core_normal	82.25	83.96	85.31	85.76

Tab. 4 Recognition accuracies (%) at different λ values

Dataset	$λ = 0$	$λ = 0.0001$	$λ = 0.01$	$λ = 0.1$	$λ = 1$
ShapeNet Core.v2	79.33	82.32	80.57	79.18	79.62
ShapeNet Core_normal	80.44	85.31	83.75	81.64	81.28
ModelNet 40	78.53	83.77	81.19	80.43	80.01

Tab. 5 Few-shot recognition accuracies (%) of different deep learning methods on ModelNet 40 dataset

Method	5-way		10-way
Method	10-shot	20-shot	10-shot	20-shot
DGCNN+cTree[40]	60.00	65.70	48.50	53.00
PointNet+cTree[40]	63.20	68.90	49.20	50.10
PointNet+Jigsaw[41]	66.50	69.20	56.90	66.50
Proposed method	84.21	81.53	80.32	80.75

Tab. 6 Five-way accuracies (%) of different few-shot recognition methods of 3D models on ModelNet 40 dataset

Method	5-way 1-shot	5-way 5-shot
Dual-LSTM[16]	46.32	62.77
Relation network	70.27	72.13
Network without Transformer	35.56	36.92
Proposed method	80.86	83.77
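
The percentage-point gains over Dual-LSTM quoted in the abstract can be checked directly against the values in Tab. 6. The short script below is only an arithmetic check; the dictionary names are chosen here for illustration.

```python
# Accuracies (%) copied from Tab. 6 (ModelNet 40, 5-way).
dual_lstm = {"1-shot": 46.32, "5-shot": 62.77}
proposed = {"1-shot": 80.86, "5-shot": 83.77}

# Gain in percentage points, rounded to two decimals to hide float noise.
gains = {k: round(proposed[k] - dual_lstm[k], 2) for k in proposed}
print(gains)  # {'1-shot': 34.54, '5-shot': 21.0}
```

Both differences match the 34.54 and 21.00 percentage points stated in the abstract.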

Fig. 10 Relation score matrix

Fig. 11 Experimental results of 3-way 1-shot 1-query

Fig. 12 Experimental results of 5-way 1-shot 1-query

Fig. 13 Example of failure results

References 41

1	WANG Y K, XU C M, LIU C, et al. Instance credibility inference for few-shot learning[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 12833-12842. doi:10.1109/cvpr42600.2020.01285
2	ZHAO K L, JIN X L, WANG Y Z. Survey on few-shot learning[J]. Journal of Software, 2021, 32(2): 349-369. doi:10.13328/j.cnki.jos.006138 (in Chinese)
3	YANG J C, GUO X L, LI Y, et al. A survey of few-shot learning in smart agriculture: developments, applications, and challenges[J]. Plant Methods, 2022, 18: No.28. doi:10.1186/s13007-022-00866-2
4	SA L B, YU C C, MA X Q, et al. Attentive fine-grained recognition for cross-domain few-shot classification[J]. Neural Computing and Applications, 2022, 34(6): 4733-4746. doi:10.1007/s00521-021-06627-x
5	SUN W Y, JIN Z, ZHAO H T, et al. Cross-domain few-shot face spoofing detection method based on deep feature augmentation[J]. Computer Science, 2021, 48(2): 330-336. doi:10.11896/jsjkx.200100020 (in Chinese)
6	SHOME D, KAR T. FedAffect: few-shot federated learning for facial expression recognition[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 4151-4158. doi:10.1109/iccvw54120.2021.00463
7	YIN L, ZHOU Q. Research on monthly power supply forecasting based on small sample data and deep residual network[J]. Computer and Digital Engineering, 2022, 50(2): 448-452. doi:10.3969/j.issn.1672-9722.2022.02.042 (in Chinese)
8	DONG Y, PAN H W, CUI Q N, et al. Few-shot segmentation method for multi-modal magnetic resonance images of brain tumor[J]. Journal of Computer Applications, 2021, 41(4): 1049-1054. doi:10.11772/j.issn.1001-9081.2020081388 (in Chinese)
9	LIU Y, LEI Y B, FAN J L, et al. Survey on image classification technology based on small sample learning[J]. Acta Automatica Sinica, 2021, 47(2): 297-315. doi:10.16383/j.aas.c190720 (in Chinese)
10	MA J W, XIE H C, HAN G X, et al. Partner-assisted learning for few-shot image classification[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 10553-10562. doi:10.1109/iccv48922.2021.01040
11	YANG F Y, WANG R P, CHEN X L. SEGA: semantic guided attention on visual prototype for few-shot learning[C]// Proceedings of the 2022 IEEE/CVF Winter Conference on Applications of Computer Vision. Piscataway: IEEE, 2022: 1586-1596. doi:10.1109/wacv51458.2022.00165
12	WERTHEIMER D, TANG L, HARIHARAN B. Few-shot classification with feature map reconstruction networks[C]// Proceedings of the 2021 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 8012-8021. doi:10.1109/cvpr46437.2021.00792
13	ADAMKIEWICZ M, CHEN T, CACCAVALE A, et al. Vision-only robot navigation in a neural radiance world[J]. IEEE Robotics and Automation Letters, 2022, 7(2): 4606-4613. doi:10.1109/lra.2022.3150497
14	WANG H P, LI Z B, WANG L. Analysis and implementation of virtual traffic signal system for autopilot simulation[J]. Automobile Applied Technology, 2020(7): 34-37. (in Chinese)
15	CHEN T, QIU E H, KONG J H, et al. Application of virtual reality technology based on CAD in hydropower station simulation system[J]. Computer and Digital Engineering, 2021, 49(4): 856-861. doi:10.3969/j.issn.1672-9722.2021.04.047 (in Chinese)
16	NIE J, XU N, ZHOU M, et al. 3D model classification based on few-shot learning[J]. Neurocomputing, 2020, 398: 539-546. doi:10.1016/j.neucom.2019.03.105
17	WU Z R, SONG S R, KHOSLA A, et al. 3D ShapeNets: a deep representation for volumetric shapes[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 1912-1920. doi:10.1109/cvpr.2015.7298801
18	DENG Y, YANG J L, TONG X. Deformed implicit field: modeling 3D shapes with learned dense correspondence[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 10281-10291. doi:10.1109/cvpr46437.2021.01015
19	QI C R, YI L, SU H, et al. PointNet++: deep hierarchical feature learning on point sets in a metric space[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2017: 5105-5114.
20	LIU B, KANG H, LI H X, et al. Few-shot open-set recognition using meta-learning[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 8795-8804. doi:10.1109/cvpr42600.2020.00882
21	BAIK S, CHOI M, CHOI J, et al. Meta-learning with adaptive hyperparameters[C]// Proceedings of the 34th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2020: 20755-20765.
22	BAIK S, HONG S, LEE K M. Learning to forget for meta-learning[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 2376-2384. doi:10.1109/cvpr42600.2020.00245
23	BAIK S, CHOI J, KIM H, et al. Meta-learning with task-adaptive loss function for few-shot learning[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 9445-9454. doi:10.1109/iccv48922.2021.00933
24	GIDARIS S, KOMODAKIS N. Dynamic few-shot visual learning without forgetting[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 4367-4375. doi:10.1109/cvpr.2018.00459
25	GIDARIS S, KOMODAKIS N. Generating classification weights with GNN Denoising Autoencoders for few-shot learning[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 21-30. doi:10.1109/cvpr.2019.00011
26	HARIHARAN B, GIRSHICK R. Low-shot visual recognition by shrinking and hallucinating features[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 3037-3046. doi:10.1109/iccv.2017.328
27	MUNKHDALAI T, YUAN X D, MEHRI S. Rapid adaptation with conditionally shifted neurons[C]// Proceedings of the 35th International Conference on Machine Learning. New York: JMLR.org, 2018: 3664-3673.
28	SCHWARTZ E, KARLINSKY L, SHTOK J, et al. Δ-encoder: an effective sample synthesis method for few-shot object recognition[C]// Proceedings of the 32nd International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2018: 2850-2860.
29	GARCIA V, BRUNA J. Few-shot learning with graph neural networks[EB/OL]. (2018-02-20)[2022-04-12].
30	YANG L, LI L L, ZHANG Z L, et al. DPGN: distribution propagation graph network for few-shot learning[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 13387-13396. doi:10.1109/cvpr42600.2020.01340
31	HAN X F, LEUNG T, JIA Y Q, et al. MatchNet: unifying feature and metric learning for patch-based matching[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 3279-3286. doi:10.1109/cvpr.2015.7298948
32	LI W B, WANG L, XU J L, et al. Revisiting local descriptor based image-to-class measure for few-shot learning[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 7253-7260. doi:10.1109/cvpr.2019.00743
33	LI H Y, EIGEN D, DODGE S, et al. Finding task-relevant features for few-shot learning by category traversal[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 1-10. doi:10.1109/cvpr.2019.00009
34	QI C R, SU H, MO K C, et al. PointNet: deep learning on point sets for 3D classification and segmentation[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 77-85. doi:10.1109/cvpr.2017.16
35	REN M Y, TRIANTAFILLOU E, RAVI S, et al. Meta-learning for semi-supervised few-shot classification[EB/OL]. (2018-03-02)[2022-04-12].
36	SUNG F, YANG Y X, ZHANG L, et al. Learning to compare: relation network for few-shot learning[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 1199-1208. doi:10.1109/cvpr.2018.00131
37	TANG S X, CHEN D P, BAI L, et al. Mutual CRF-GNN for few-shot learning[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 2329-2339. doi:10.1109/cvpr46437.2021.00236
38	DOERSCH C, GUPTA A, ZISSERMAN A. CrossTransformers: spatially-aware few-shot transfer[C]// Proceedings of the 34th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2020: 21981-21993.
39	YE H J, HU H X, ZHAN D C, et al. Few-shot learning via embedding adaptation with set-to-set functions[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 8805-8814. doi:10.1109/cvpr42600.2020.00883
40	SHARMA C, KAUL M. Self-supervised few-shot learning on point clouds[C]// Proceedings of the 34th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2020: 7212-7221.
41	SAUDER J, SIEVERS B. Self-supervised deep learning on point clouds by reconstructing space[C]// Proceedings of the 33rd International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2019: 12962-12972.
