Journal of Computer Applications, 2023, Vol. 43, Issue (6): 1750-1758. DOI: 10.11772/j.issn.1001-9081.2022060952

• The 37th CCF National Conference of Computer Applications (CCF NCCA 2022) •

Few-shot recognition method of 3D models based on Transformer

Hui WANG, Jianhong LI

  1. School of Information Science and Technology, Shijiazhuang Tiedao University, Shijiazhuang, Hebei 050043, China
  • Received: 2022-06-30 Revised: 2022-10-24 Accepted: 2022-10-26 Online: 2022-11-16 Published: 2023-06-10
  • Contact: Hui WANG
  • About the authors: WANG Hui, born in 1983 in Shijiazhuang, Hebei, Ph.D., associate professor, CCF member. His research interests include computer graphics and artificial intelligence. Email: wangh@stdu.edu.cn
    LI Jianhong, born in 1995 in Hengshui, Hebei, M.S. Her research interests include computer graphics and artificial intelligence.
  • Supported by:
    National Natural Science Foundation of China (61972267); Key Project of Science and Technology Research of Hebei Province Colleges and Universities (ZD2021333)

Abstract:

To address the classification of three-dimensional (3D) models, a Transformer-based few-shot recognition method for 3D models is proposed. First, the 3D point cloud models of the support and query samples are fed into a feature extraction module to obtain feature vectors. Then, the attention features of the support samples are computed in a Transformer module. Finally, a cosine similarity network is used to compute the relation scores between the query samples and the support samples. On the ModelNet 40 dataset, compared with the Dual-Long Short-Term Memory (Dual-LSTM) method, the proposed method improves recognition accuracy by 34.54 percentage points in the 5-way 1-shot setting and by 21.00 percentage points in the 5-way 5-shot setting. The proposed method also achieves high accuracy on the ShapeNet Core dataset. The experimental results show that the proposed method recognizes entirely new categories of 3D models more accurately.
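The three stages named in the abstract (feature extraction, attention over the support samples, cosine-similarity scoring) can be illustrated with a minimal, untrained sketch. Everything in it is an assumption made for illustration and is not taken from the paper: `extract_features` is a hand-written placeholder rather than a point cloud neural network, `self_attention` is single-head attention with identity projections rather than a trained Transformer module, and `classify` compares the query with a per-class mean of the attended support features using plain cosine similarity rather than a learned similarity network.

```python
import math


def extract_features(points):
    """Placeholder extractor (NOT the paper's network): per-axis max and
    mean of an N x 3 point cloud, concatenated into a 6-d vector."""
    cols = list(zip(*points))
    return [max(c) for c in cols] + [sum(c) / len(c) for c in cols]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(feats):
    """Single-head scaled dot-product self-attention with identity Q/K/V
    projections (an assumption; a trained model would learn these)."""
    scale = math.sqrt(len(feats[0]))
    out = []
    for q in feats:
        weights = softmax([dot(q, k) / scale for k in feats])
        out.append([sum(w * v[j] for w, v in zip(weights, feats))
                    for j in range(len(q))])
    return out


def cosine(a, b):
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / norm if norm > 0 else 0.0


def classify(support, query_cloud):
    """support maps each class label to its K support point clouds.
    Returns (predicted label, {label: relation score})."""
    labels, feats = [], []
    for label, clouds in support.items():
        for cloud in clouds:
            labels.append(label)
            feats.append(extract_features(cloud))
    attended = self_attention(feats)  # attention features of the support set
    q = extract_features(query_cloud)
    scores = {}
    for label in support:
        rows = [f for lab, f in zip(labels, attended) if lab == label]
        prototype = [sum(col) / len(rows) for col in zip(*rows)]  # class mean
        scores[label] = cosine(q, prototype)
    return max(scores, key=scores.get), scores


# Toy 2-way 1-shot episode with hand-made point clouds (hypothetical data).
support = {
    "a": [[(1.0, 0.0, 0.0), (0.9, 0.1, 0.0), (1.1, -0.1, 0.0)]],
    "b": [[(0.0, 1.0, 0.0), (0.1, 0.9, 0.0), (-0.1, 1.1, 0.0)]],
}
query = [(1.0, 0.05, 0.0), (0.95, 0.0, 0.0), (1.05, -0.05, 0.0)]
predicted, relation_scores = classify(support, query)
print(predicted, relation_scores)
```

In an N-way K-shot episode, `support` would hold N labels with K point clouds each, so the 5-way 1-shot and 5-way 5-shot settings reported above correspond to five labels with one or five clouds per label.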

Keywords: few-shot recognition, three-dimensional (3D) model, attention mechanism, point cloud neural network, meta-learning

CLC number: