Few-shot recognition method of 3D models based on Transformer

doi:10.11772/j.issn.1001-9081.2022060952

Journal of Computer Applications, 2023, Vol. 43, Issue (6): 1750-1758. DOI: 10.11772/j.issn.1001-9081.2022060952

Special topic: The 37th CCF National Conference of Computer Applications (CCF NCCA 2022)

Hui WANG, Jianhong LI

School of Information Science and Technology, Shijiazhuang Tiedao University, Shijiazhuang Hebei 050043, China

Received: 2022-06-30 Revised: 2022-10-24 Accepted: 2022-10-26 Online: 2022-11-16 Published: 2023-06-10
Contact: Hui WANG
About author: WANG Hui, born in 1983 in Shijiazhuang, Hebei, Ph. D., associate professor, CCF member. His research interests include computer graphics, artificial intelligence. Email: wangh@stdu.edu.cn
LI Jianhong, born in 1995 in Hengshui, Hebei, M. S. Her research interests include computer graphics, artificial intelligence.
Supported by:
National Natural Science Foundation of China (61972267); Key Project of Science and Technology Research of Hebei Province Colleges and Universities (ZD2021333)

Abstract

To address the classification of three-dimensional (3D) models, a Transformer-based few-shot recognition method for 3D models was proposed. Firstly, the 3D point cloud models of the support and query samples were fed into the feature extraction module to obtain feature vectors. Then, the attention features of the support samples were calculated in the Transformer module. Finally, a cosine similarity network was used to calculate the relation scores between the query samples and the support samples. On the ModelNet 40 dataset, compared with the two-layer Long Short-Term Memory (Dual-LSTM) method, the proposed method improved the 5-way 1-shot and 5-way 5-shot recognition accuracies by 34.54 and 21.00 percentage points, respectively. The proposed method also achieved high accuracy on the ShapeNet Core dataset. Experimental results show that the proposed method can recognize entirely new categories of 3D models more accurately.

Key words: few-shot recognition, three-dimensional (3D) model, attention mechanism, point cloud neural network, meta-learning
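The last two stages described in the abstract (attention over the support features, then cosine-similarity relation scores) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' network: the feature vectors are random stand-ins for the output of the feature extraction module, `self_attention` is a single-head, weight-free substitute for the Transformer module, and all function names are hypothetical.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def self_attention(support):
    """Single-head scaled dot-product self-attention with a residual connection.

    Assumption: a stand-in for the Transformer module, with no learned weights.
    support: (n_support, dim) array of support feature vectors.
    """
    dim = support.shape[-1]
    logits = support @ support.T / np.sqrt(dim)
    logits -= logits.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=-1, keepdims=True)
    return support + weights @ support

def relation_scores(query, support):
    """Cosine similarity between each query and each attended support feature."""
    attended = self_attention(support)
    return l2_normalize(query) @ l2_normalize(attended).T

rng = np.random.default_rng(0)
support = rng.normal(size=(5, 64))  # 5-way 1-shot: one feature vector per class
query = support[[2]] + 0.01 * rng.normal(size=(1, 64))  # query close to class 2
scores = relation_scores(query, support)  # shape (1, 5)
predicted = int(scores.argmax(axis=-1)[0])
```

The query is assigned to the class whose support feature yields the highest relation score; in a trained model both the features and the attention weights would be learned rather than fixed as here.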

CLC Number: TP391.41

Hui WANG, Jianhong LI. Few-shot recognition method of 3D models based on Transformer[J]. Journal of Computer Applications, 2023, 43(6): 1750-1758.

Figures/Tables 19

Fig. 1 Framework of few-shot recognition method of 3D models based on Transformer

Fig. 2 Framework of feature extraction module

Fig. 3 Structure of Transformer module

Fig. 4 Visualization of some models in ModelNet 40 dataset

Fig. 5 Visualization of some models in ShapeNet Core dataset

Fig. 6 Point cloud models with different numbers of points

Fig. 7 Prediction accuracy varying with different sampling point numbers

Tab. 1 Accuracy of 1-shot experiments at different sampling point numbers (%)

Number of sampling points	ModelNet 40 (3-way)	ModelNet 40 (5-way)	ShapeNet Core.v2 (3-way)	ShapeNet Core.v2 (5-way)	ShapeNet Core_normal (3-way)	ShapeNet Core_normal (5-way)
256	83.28	78.25	80.12	79.06	85.99	80.86
512	86.59	79.06	80.28	79.63	96.05	84.39
1024	87.37	80.86	81.75	81.51	83.96	82.25
2048	87.21	81.32	80.63	79.96	78.68	81.01
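Table 1 varies the number of points sampled from each model. This page does not state how the points are drawn, so the sketch below assumes uniform random sampling followed by unit-sphere normalization, a common preprocessing choice for point clouds; the function names and the synthetic cloud are hypothetical.

```python
import numpy as np

def sample_points(points, n_samples, rng):
    """Pick n_samples points at random; sample with replacement only if the cloud is too small."""
    replace = points.shape[0] < n_samples
    idx = rng.choice(points.shape[0], size=n_samples, replace=replace)
    return points[idx]

def normalize_unit_sphere(points):
    """Center the cloud at the origin and scale it so the farthest point has norm 1."""
    centered = points - points.mean(axis=0, keepdims=True)
    scale = np.linalg.norm(centered, axis=1).max()
    return centered / scale

rng = np.random.default_rng(0)
cloud = rng.uniform(-1.0, 1.0, size=(4096, 3))  # synthetic stand-in for a 3D model
subsets = {
    n: normalize_unit_sphere(sample_points(cloud, n, rng))
    for n in (256, 512, 1024, 2048)  # the sampling point numbers listed in Tab. 1
}
```

Each entry of `subsets` is an `(n, 3)` array that could be fed to a point cloud feature extractor; other strategies such as farthest point sampling would fit the same interface.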

Fig. 8 Curves of loss changing with the number of iterations at different sampling point numbers

Tab. 2 Accuracies of the proposed method in 1-shot experiments on ModelNet 40 and ShapeNet Core datasets (%)

Dataset	3-way	5-way
ModelNet 40	87.37	80.86
ShapeNet Core.v2	81.75	81.51
ShapeNet Core_normal	83.96	82.25

Fig. 9 Curves of loss changing with C value on different datasets

Tab. 3 Accuracies of 5-way K-shot experiments on ModelNet 40 and ShapeNet Core_normal datasets (%)

Dataset	K=1	K=2	K=5	K=10
ModelNet 40	80.86	81.25	83.77	84.21
ShapeNet Core_normal	82.25	83.96	85.31	85.76
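In the usual few-shot protocol, an N-way K-shot episode such as those behind Tab. 3 is built by drawing N classes, then K labeled support samples and some query samples per class. The sketch below illustrates that construction on a toy dataset; the `sample_episode` helper, the dataset layout, and the file-name strings are hypothetical and not taken from the paper.

```python
import random

def sample_episode(labels_to_items, n_way, k_shot, n_query, rng):
    """Draw one N-way K-shot episode as (item, episode_label) pairs."""
    classes = rng.sample(sorted(labels_to_items), n_way)
    support, query = [], []
    for episode_label, cls in enumerate(classes):
        # Draw support and query items together so they never overlap.
        items = rng.sample(labels_to_items[cls], k_shot + n_query)
        support += [(item, episode_label) for item in items[:k_shot]]
        query += [(item, episode_label) for item in items[k_shot:]]
    return support, query

rng = random.Random(0)
# Toy dataset: 10 classes with 20 model identifiers each.
dataset = {
    f"class_{c}": [f"class_{c}/model_{i}" for i in range(20)]
    for c in range(10)
}
support, query = sample_episode(dataset, n_way=5, k_shot=5, n_query=1, rng=rng)
```

With `n_way=5` and `k_shot=5` the support set holds 25 samples and the query set holds 5; changing `k_shot` to 1, 2, or 10 gives the other columns of the table.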

Tab. 4 Recognition accuracies at different λ values (%)

Dataset	$λ = 0$	$λ = 0.0001$	$λ = 0.01$	$λ = 0.1$	$λ = 1$
ShapeNet Core.v2	79.33	82.32	80.57	79.18	79.62
ShapeNet Core_normal	80.44	85.31	83.75	81.64	81.28
ModelNet 40	78.53	83.77	81.19	80.43	80.01

Tab. 5 Few-shot recognition accuracies of different deep learning methods on ModelNet 40 dataset (%)

Method	5-way 10-shot	5-way 20-shot	10-way 10-shot	10-way 20-shot
DGCNN+cTree [40]	60.00	65.70	48.50	53.00
PointNet+cTree [40]	63.20	68.90	49.20	50.10
PointNet+Jigsaw [41]	66.50	69.20	56.90	66.50
Proposed method	84.21	81.53	80.32	80.75

Tab. 6 Five-way accuracies of different few-shot recognition methods of 3D models on ModelNet 40 dataset (%)

Method	5-way 1-shot	5-way 5-shot
Dual-LSTM [16]	46.32	62.77
Relation network	70.27	72.13
Network without Transformer	35.56	36.92
Proposed method	80.86	83.77
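The percentage-point gains quoted in the abstract follow directly from Tab. 6. The short check below recomputes them from the table values; the dictionary layout and variable names are illustrative only.

```python
# Accuracies (%) from Tab. 6: (5-way 1-shot, 5-way 5-shot) on ModelNet 40.
table6 = {
    "Dual-LSTM": (46.32, 62.77),
    "Relation network": (70.27, 72.13),
    "Network without Transformer": (35.56, 36.92),
    "Proposed method": (80.86, 83.77),
}

proposed_1shot, proposed_5shot = table6["Proposed method"]

# Gain of the proposed method over each baseline, in percentage points.
gains = {
    name: (round(proposed_1shot - acc1, 2), round(proposed_5shot - acc5, 2))
    for name, (acc1, acc5) in table6.items()
    if name != "Proposed method"
}

for name, (gain1, gain5) in gains.items():
    print(f"{name}: +{gain1:.2f} (1-shot), +{gain5:.2f} (5-shot)")
```

For Dual-LSTM this gives 80.86 - 46.32 = 34.54 and 83.77 - 62.77 = 21.00 percentage points, matching the abstract.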

Fig. 10 Relation score matrix

Fig. 11 Experimental results of 3-way 1-shot 1-query

Fig. 12 Experimental results of 5-way 1-shot 1-query

Fig. 13 Example of failure results

References 41

1	WANG Y K, XU C M, LIU C, et al. Instance credibility inference for few-shot learning [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 12833-12842. DOI: 10.1109/cvpr42600.2020.01285
2	ZHAO K L, JIN X L, WANG Y Z. Survey on few-shot learning [J]. Journal of Software, 2021, 32(2): 349-369. (in Chinese) DOI: 10.13328/j.cnki.jos.006138
3	YANG J C, GUO X L, LI Y, et al. A survey of few-shot learning in smart agriculture: developments, applications, and challenges [J]. Plant Methods, 2022, 18: No.28. DOI: 10.1186/s13007-022-00866-2
4	SA L B, YU C C, MA X Q, et al. Attentive fine-grained recognition for cross-domain few-shot classification [J]. Neural Computing and Applications, 2022, 34(6): 4733-4746. DOI: 10.1007/s00521-021-06627-x
5	SUN W Y, JIN Z, ZHAO H T, et al. Cross-domain few-shot face spoofing detection method based on deep feature augmentation [J]. Computer Science, 2021, 48(2): 330-336. (in Chinese) DOI: 10.11896/jsjkx.200100020
6	SHOME D, KAR T. FedAffect: few-shot federated learning for facial expression recognition [C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 4151-4158. DOI: 10.1109/iccvw54120.2021.00463
7	YIN L, ZHOU Q. Research on monthly power supply forecasting based on small sample data and deep residual network [J]. Computer and Digital Engineering, 2022, 50(2): 448-452. (in Chinese) DOI: 10.3969/j.issn.1672-9722.2022.02.042
8	DONG Y, PAN H W, CUI Q N, et al. Few-shot segmentation method for multi-modal magnetic resonance images of brain tumor [J]. Journal of Computer Applications, 2021, 41(4): 1049-1054. (in Chinese) DOI: 10.11772/j.issn.1001-9081.2020081388
9	LIU Y, LEI Y B, FAN J L, et al. Survey on image classification technology based on small sample learning [J]. Acta Automatica Sinica, 2021, 47(2): 297-315. (in Chinese) DOI: 10.16383/j.aas.c190720
10	MA J W, XIE H C, HAN G X, et al. Partner-assisted learning for few-shot image classification [C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 10553-10562. DOI: 10.1109/iccv48922.2021.01040
11	YANG F Y, WANG R P, CHEN X L. SEGA: semantic guided attention on visual prototype for few-shot learning [C]// Proceedings of the 2022 IEEE/CVF Winter Conference on Applications of Computer Vision. Piscataway: IEEE, 2022: 1586-1596. DOI: 10.1109/wacv51458.2022.00165
12	WERTHEIMER D, TANG L, HARIHARAN B. Few-shot classification with feature map reconstruction networks [C]// Proceedings of the 2021 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 8012-8021. DOI: 10.1109/cvpr46437.2021.00792
13	ADAMKIEWICZ M, CHEN T, CACCAVALE A, et al. Vision-only robot navigation in a neural radiance world [J]. IEEE Robotics and Automation Letters, 2022, 7(2): 4606-4613. DOI: 10.1109/lra.2022.3150497
14	WANG H P, LI Z B, WANG L. Analysis and implementation of virtual traffic signal system for autopilot simulation [J]. Automobile Applied Technology, 2020(7): 34-37. (in Chinese)
15	CHEN T, QIU E H, KONG J H, et al. Application of virtual reality technology based on CAD in hydropower station simulation system [J]. Computer and Digital Engineering, 2021, 49(4): 856-861. (in Chinese) DOI: 10.3969/j.issn.1672-9722.2021.04.047
16	NIE J, XU N, ZHOU M, et al. 3D model classification based on few-shot learning [J]. Neurocomputing, 2020, 398: 539-546. DOI: 10.1016/j.neucom.2019.03.105
17	WU Z R, SONG S R, KHOSLA A, et al. 3D ShapeNets: a deep representation for volumetric shapes [C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 1912-1920. DOI: 10.1109/cvpr.2015.7298801
18	DENG Y, YANG J L, TONG X. Deformed implicit field: modeling 3D shapes with learned dense correspondence [C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 10281-10291. DOI: 10.1109/cvpr46437.2021.01015
19	QI C R, YI L, SU H, et al. PointNet++: deep hierarchical feature learning on point sets in a metric space [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2017: 5105-5114.
20	LIU B, KANG H, LI H X, et al. Few-shot open-set recognition using meta-learning [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 8795-8804. DOI: 10.1109/cvpr42600.2020.00882
21	BAIK S, CHOI M, CHOI J, et al. Meta-learning with adaptive hyperparameters [C]// Proceedings of the 34th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2020: 20755-20765.
22	BAIK S, HONG S, LEE K M. Learning to forget for meta-learning [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 2376-2384. DOI: 10.1109/cvpr42600.2020.00245
23	BAIK S, CHOI J, KIM H, et al. Meta-learning with task-adaptive loss function for few-shot learning [C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 9445-9454. DOI: 10.1109/iccv48922.2021.00933
24	GIDARIS S, KOMODAKIS N. Dynamic few-shot visual learning without forgetting [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 4367-4375. DOI: 10.1109/cvpr.2018.00459
25	GIDARIS S, KOMODAKIS N. Generating classification weights with GNN Denoising Autoencoders for few-shot learning [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 21-30. DOI: 10.1109/cvpr.2019.00011
26	HARIHARAN B, GIRSHICK R. Low-shot visual recognition by shrinking and hallucinating features [C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 3037-3046. DOI: 10.1109/iccv.2017.328
27	MUNKHDALAI T, YUAN X D, MEHRI S. Rapid adaptation with conditionally shifted neurons [C]// Proceedings of the 35th International Conference on Machine Learning. New York: JMLR.org, 2018: 3664-3673.
28	SCHWARTZ E, KARLINSKY L, SHTOK J, et al. Δ-encoder: an effective sample synthesis method for few-shot object recognition [C]// Proceedings of the 32nd International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2018: 2850-2860.
29	GARCIA V, BRUNA J. Few-shot learning with graph neural networks [EB/OL]. (2018-02-20) [2022-04-12].
30	YANG L, LI L L, ZHANG Z L, et al. DPGN: distribution propagation graph network for few-shot learning [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 13387-13396. DOI: 10.1109/cvpr42600.2020.01340
31	HAN X F, LEUNG T, JIA Y Q, et al. MatchNet: unifying feature and metric learning for patch-based matching [C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 3279-3286. DOI: 10.1109/cvpr.2015.7298948
32	LI W B, WANG L, XU J L, et al. Revisiting local descriptor based image-to-class measure for few-shot learning [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 7253-7260. DOI: 10.1109/cvpr.2019.00743
33	LI H Y, EIGEN D, DODGE S, et al. Finding task-relevant features for few-shot learning by category traversal [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 1-10. DOI: 10.1109/cvpr.2019.00009
34	QI C R, SU H, MO K C, et al. PointNet: deep learning on point sets for 3D classification and segmentation [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 77-85. DOI: 10.1109/cvpr.2017.16
35	REN M Y, TRIANTAFILLOU E, RAVI S, et al. Meta-learning for semi-supervised few-shot classification [EB/OL]. (2018-03-02) [2022-04-12].
36	SUNG F, YANG Y X, ZHANG L, et al. Learning to compare: relation network for few-shot learning [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 1199-1208. DOI: 10.1109/cvpr.2018.00131
37	TANG S X, CHEN D P, BAI L, et al. Mutual CRF-GNN for few-shot learning [C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 2329-2339. DOI: 10.1109/cvpr46437.2021.00236
38	DOERSCH C, GUPTA A, ZISSERMAN A. CrossTransformers: spatially-aware few-shot transfer [C]// Proceedings of the 34th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2020: 21981-21993.
39	YE H J, HU H X, ZHAN D C, et al. Few-shot learning via embedding adaptation with set-to-set functions [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 8805-8814. DOI: 10.1109/cvpr42600.2020.00883
40	SHARMA C, KAUL M. Self-supervised few-shot learning on point clouds [C]// Proceedings of the 34th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2020: 7212-7221.
41	SAUDER J, SIEVERS B. Self-supervised deep learning on point clouds by reconstructing space [C]// Proceedings of the 33rd International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2019: 12962-12972.
