Cross-domain few-shot classification model based on relation network and vision Transformer

doi:10.11772/j.issn.1001-9081.2023121852

Journal of Computer Applications (sole official website)

YAN Yiqin¹, LUO Chuan², LI Tianrui³, CHEN Hongmei⁴

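1. Sichuan University
2. College of Computer Science, Sichuan University
3. School of Information Science and Technology, Southwest Jiaotong University, Chengdu 610031
4. Southwest Jiaotong University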

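Received: 2024-01-04  Revised: 2024-03-13  Online: 2024-04-28  Published: 2024-04-28
Corresponding author: YAN Yiqin
Funding:
National Natural Science Foundation of China; National Natural Science Foundation of China; Natural Science Foundation of Sichuan Province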

Abstract

To address the low classification accuracy of few-shot learning models under domain shift, a cross-domain few-shot image classification model based on relation network and vision Transformer, ReViT (Relation Vision Transformer), was proposed. First, a vision Transformer was introduced as the feature extractor; this pre-trained deep neural network overcomes the limited feature representation ability of shallow neural networks. Second, a shallow convolutional network was used as a task adapter to improve the knowledge transfer ability of the model, and a nonlinear classifier was built on a relation network and a channel attention mechanism to fuse the features of the feature extractor and the task adapter, thereby strengthening the generalization ability of the model. Finally, a four-stage learning strategy of "pre-training, meta-training, fine-tuning, meta-testing" was adopted to train the model; by effectively combining transfer learning with meta-learning, it further improved the cross-domain classification performance of ReViT. Experimental results show that ReViT performs well on cross-domain few-shot classification. On the Meta-Dataset dataset, its classification accuracy exceeds that of the second-best model by 5.82 and 1.17 percentage points in the in-domain and out-of-domain scenarios, respectively. On the BCDFSL (Broader study of Cross-Domain Few-Shot Learning) dataset, it exceeds the second-best model by 1.00, 1.54 and 2.43 percentage points on the 5-way 5-shot tasks of the three sub-problems EuroSat (European Satellite data), CropDisease and ISIC (International Skin Imaging Collaboration), respectively; by 0.13, 0.97 and 3.40 percentage points on the 5-way 20-shot tasks of EuroSat, CropDisease and ISIC, respectively; and by 0.36 percentage points on the 5-way 50-shot task of CropDisease. ReViT maintains good accuracy on image classification tasks with scarce samples and can improve system efficiency in practical applications such as satellite image recognition, human skin disease recognition and crop disease recognition.

Keywords: few-shot learning, relation network, cross-domain learning, meta-learning, image classification
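
The results in the abstract are reported on N-way K-shot tasks such as 5-way 5-shot. As a minimal sketch of how one such evaluation episode is commonly assembled (not the authors' code), the snippet below draws N classes, then K support samples and a few query samples per class. The function name `sample_episode`, the `n_query` parameter, and the toy `pool` layout (a dict from class label to a list of samples) are assumptions made for illustration only.

```python
import random


def sample_episode(pool, n_way=5, k_shot=5, n_query=15, seed=None):
    """Draw one N-way K-shot episode from a labeled pool.

    pool: dict mapping a class label to a list of samples (hypothetical layout).
    Returns (support, query), each a list of (sample, episode_label) pairs,
    where episode_label is an index in range(n_way).
    """
    rng = random.Random(seed)
    # Only classes with enough samples for both support and query are usable.
    eligible = [c for c, items in pool.items() if len(items) >= k_shot + n_query]
    if len(eligible) < n_way:
        raise ValueError("not enough classes with k_shot + n_query samples")
    classes = rng.sample(sorted(eligible), n_way)
    support, query = [], []
    for episode_label, cls in enumerate(classes):
        # Sample without replacement so support and query never overlap.
        picked = rng.sample(pool[cls], k_shot + n_query)
        support += [(x, episode_label) for x in picked[:k_shot]]
        query += [(x, episode_label) for x in picked[k_shot:]]
    return support, query


# Toy pool: 8 classes with 30 integer "images" each (placeholder data only).
toy_pool = {f"class_{c}": [c * 100 + i for i in range(30)] for c in range(8)}
support, query = sample_episode(toy_pool, n_way=5, k_shot=5, n_query=15, seed=0)
print(len(support), len(query))  # 25 support samples and 75 query samples
```

In this setup a classifier sees only the 25 labeled support samples of an episode and is scored on the 75 query samples; accuracy is then averaged over many episodes.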

CLC number: TP391

YAN Yiqin, LUO Chuan, LI Tianrui, CHEN Hongmei. Cross-domain few-shot classification model based on relation network and vision Transformer[J]. Journal of Computer Applications, DOI: 10.11772/j.issn.1001-9081.2023121852.

