Journal of Computer Applications, 2025, Vol. 45, Issue (3): 840-848. DOI: 10.11772/j.issn.1001-9081.2024091297
About author:
CHEN Wei (born 2000), male, from Wuhan, Hubei, M.S. candidate. His main research interests include multi-modal learning and data fusion.
Wei CHEN, Changyong SHI, Chuanxiang MA
Received:
2024-09-10
Revised:
2024-11-13
Accepted:
2024-11-18
Online:
2024-11-29
Published:
2025-03-10
Contact:
Chuanxiang MA
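Abstract:
Existing crop disease recognition methods based on deep learning models rely on crop-specific disease image datasets for image feature learning and overlook the importance of text features in assisting image feature learning. To improve the model's feature extraction and disease recognition capabilities on crop disease images more effectively, a crop disease recognition method based on Contrastive Language-Image Pre-training (CLIP) and multi-modal data fusion, named CDR-CLIP, was proposed. First, a high-quality image-text pair dataset for disease recognition was constructed, so that text information could enhance the feature representation of crop disease images. Second, a multi-modal fusion strategy was used to combine text features with image features, strengthening the model's ability to discriminate between diseases. Finally, dedicated pre-training and fine-tuning strategies were designed to optimize the model's performance on specific crop disease recognition tasks. Experimental results show that on the PlantVillage and AI Challenger 2018 crop disease datasets, CDR-CLIP achieves disease recognition accuracies of 99.31% and 87.66% and F1 scores of 99.04% and 87.56%, respectively; on the PlantDoc crop disease dataset, CDR-CLIP achieves a mean Average Precision (mAP@0.5) of 51.10%, demonstrating the strong performance of CDR-CLIP.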
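The contrastive language-image pre-training that CDR-CLIP builds on can be illustrated with a minimal sketch of the symmetric contrastive loss used by CLIP-style models. The abstract does not give the loss, the encoders, or the fusion strategy, so everything below is an assumption: the function names, the toy embeddings, and the temperature value are hypothetical and are not taken from the paper.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def cross_entropy_on_diagonal(logits):
    """Mean of -log softmax(row i)[i]: the correct match for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the maximum for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(v - row_max) for v in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def clip_style_loss(image_embs, text_embs, temperature=0.07):
    """Average of the image-to-text and text-to-image cross-entropy losses."""
    images = [normalize(v) for v in image_embs]
    texts = [normalize(v) for v in text_embs]
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in texts]
        for img in images
    ]
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_on_diagonal(logits) + cross_entropy_on_diagonal(transposed))


# Toy embeddings (hypothetical values): pair i points roughly in the same direction.
image_embs = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.1], [0.1, 0.0, 1.0]]
text_embs = [[0.9, 0.0, 0.1], [0.1, 0.9, 0.0], [0.0, 0.1, 0.9]]

aligned_loss = clip_style_loss(image_embs, text_embs)
# Rotating the captions breaks every pairing, so the loss should rise.
misaligned_loss = clip_style_loss(image_embs, text_embs[1:] + text_embs[:1])
print(f"aligned: {aligned_loss:.4f}, misaligned: {misaligned_loss:.4f}")
```

The loss is low when each image embedding is closest to its own caption and high when the pairing is broken, which is the signal an image-text pair dataset provides during pre-training.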
Wei CHEN, Changyong SHI, Chuanxiang MA. Crop disease recognition method based on multi-modal data fusion[J]. Journal of Computer Applications, 2025, 45(3): 840-848.
Tab. 1 Dataset statistics
Dataset | Images | Texts | Valid image-text pairs |
---|---|---|---|
PlantVillage | 55 446 | 55 446 | 55 326 |
AI Challenger 2018 | 36 258 | 36 258 | 35 200 |
PlantDoc | 2 580 | 2 580 | 2 302 |
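As a quick check on Tab. 1, the share of images that survive as valid image-text pairs can be computed directly from the counts. This is a small arithmetic sketch; the page does not state why some pairs were discarded, and the helper name `retention_rate` is hypothetical.

```python
# Counts copied from Tab. 1: (images, texts, valid image-text pairs).
dataset_counts = {
    "PlantVillage": (55446, 55446, 55326),
    "AI Challenger 2018": (36258, 36258, 35200),
    "PlantDoc": (2580, 2580, 2302),
}


def retention_rate(images: int, valid_pairs: int) -> float:
    """Percentage of images that end up in a valid image-text pair."""
    return 100.0 * valid_pairs / images


retention = {
    name: retention_rate(images, pairs)
    for name, (images, _texts, pairs) in dataset_counts.items()
}
for name, rate in retention.items():
    print(f"{name}: {rate:.2f}% of images retained")
```

On these counts PlantDoc, the smallest dataset, also loses the largest fraction of its images, roughly one in ten.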
Tab. 2 Performance comparison of different methods on PlantVillage and AI Challenger 2018 datasets (%)
Method | PlantVillage accuracy | PlantVillage precision | PlantVillage recall | PlantVillage F1 | AI Challenger 2018 accuracy | AI Challenger 2018 precision | AI Challenger 2018 recall | AI Challenger 2018 F1 |
---|---|---|---|---|---|---|---|---|
SMLP_ResNet18[ | 99.32 | 99.10 | 86.93 | — | — | — | | |
Ref. [ | 99.32 | 99.10 | 98.78 | 98.92 | 86.93 | 84.15 | 83.42 | 83.60 |
DCNN[ | 99.48 | — | — | 99.23 | — | — | — | — |
VGG-ICNN[ | 99.16 | — | — | — | — | — | — | — |
EWPRC-ResNet-t[ | — | — | — | — | 87.42 | 85.36 | 84.23 | 84.79 |
CDR-CLIP | 99.31 | 99.10 | 98.98 | 99.04 | 87.66 | 88.26 | 87.66 | 87.56 |
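The four metrics in Tab. 2 can all be derived from a multi-class confusion matrix. The sketch below shows one common way to do so, under assumptions: the page does not state which averaging scheme the paper uses, so both macro and support-weighted averaging are shown, and the confusion matrix is a hypothetical toy example rather than data from the paper.

```python
def safe_div(num, den):
    """Division that returns 0.0 instead of failing on an empty class."""
    return num / den if den else 0.0


def weighted_mean(values, weights):
    return sum(w * v for w, v in zip(weights, values))


def classification_metrics(confusion, average="macro"):
    """confusion[i][j] = number of samples of true class i predicted as class j."""
    n = len(confusion)
    true_pos = [confusion[i][i] for i in range(n)]
    predicted = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    actual = [sum(confusion[i]) for i in range(n)]
    total = sum(actual)

    precision = [safe_div(t, p) for t, p in zip(true_pos, predicted)]
    recall = [safe_div(t, a) for t, a in zip(true_pos, actual)]
    f1 = [safe_div(2 * p * r, p + r) for p, r in zip(precision, recall)]

    if average == "macro":
        weights = [1.0 / n] * n              # every class counts equally
    elif average == "weighted":
        weights = [a / total for a in actual]  # classes count by sample share
    else:
        raise ValueError("average must be 'macro' or 'weighted'")

    return {
        "accuracy": sum(true_pos) / total,
        "precision": weighted_mean(precision, weights),
        "recall": weighted_mean(recall, weights),
        "f1": weighted_mean(f1, weights),
    }


# Hypothetical 3-class confusion matrix, not taken from the paper.
toy_confusion = [
    [50, 2, 1],
    [3, 20, 2],
    [0, 1, 9],
]
macro = classification_metrics(toy_confusion, "macro")
weighted = classification_metrics(toy_confusion, "weighted")
print(macro)
print(weighted)
```

Under support-weighted averaging, recall is always equal to accuracy, which matches the pattern in some rows of the tables (for example 87.66 for both on AI Challenger 2018); whether the paper used that scheme is not stated here.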
Tab. 3 Performance comparison of different methods on PlantDoc dataset (%)
Method | mAP@0.5 | mAP@0.5:0.95 |
---|---|---|
Faster-RCNN-MobileNet[ | 32.80 | — |
YOLOR-Light-v1[ | 42.70 | 32.70 |
TL-SE-ResNeXt-101[ | 47.37 | — |
Ref. [ | 48.20 | 33.30 |
DETR[ | 48.90 | 46.30 |
CDR-CLIP | 51.10 | 33.90 |
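The two columns of Tab. 3 differ only in the overlap required for a detection to count as correct: mAP@0.5 uses a single IoU threshold of 0.5, while mAP@0.5:0.95 averages over thresholds from 0.5 to 0.95 in steps of 0.05. The sketch below shows only the IoU test that underlies both, not a full mAP computation; the boxes are hypothetical.

```python
def box_iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# IoU thresholds 0.50, 0.55, ..., 0.95 used by mAP@0.5:0.95.
iou_thresholds = [t / 100 for t in range(50, 100, 5)]

# Hypothetical ground-truth and predicted boxes.
ground_truth = (10.0, 10.0, 50.0, 50.0)
prediction = (15.0, 10.0, 55.0, 50.0)

iou = box_iou(ground_truth, prediction)
passed = [t for t in iou_thresholds if iou >= t]
print(f"IoU = {iou:.3f}; counts as a match at {len(passed)} of {len(iou_thresholds)} thresholds")
```

A prediction like this one counts as correct under mAP@0.5 but fails the stricter thresholds, which is why mAP@0.5:0.95 is normally the lower of the two numbers.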
Tab. 4 Results of ablation experiment on PlantVillage and AI Challenger 2018 datasets (%)
Dataset | Text modality | Accuracy | Precision | Recall | F1 |
---|---|---|---|---|---|
PlantVillage | × | 98.30 | 98.33 | 98.30 | 98.30 |
PlantVillage | √ | 99.31 | 99.10 | 98.98 | 99.04 |
AI Challenger 2018 | × | 83.17 | 84.29 | 83.17 | 82.81 |
AI Challenger 2018 | √ | 87.66 | 88.26 | 87.66 | 87.56 |
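The effect of the text modality in Tab. 4 is easier to read as a difference in percentage points. The short sketch below computes those gains from the table values; the dictionary layout and the `gains` helper are hypothetical conveniences, not part of the paper.

```python
# Values copied from Tab. 4 (%): without (image_only) and with (image_text) the text modality.
ablation = {
    "PlantVillage": {
        "image_only": {"accuracy": 98.30, "precision": 98.33, "recall": 98.30, "f1": 98.30},
        "image_text": {"accuracy": 99.31, "precision": 99.10, "recall": 98.98, "f1": 99.04},
    },
    "AI Challenger 2018": {
        "image_only": {"accuracy": 83.17, "precision": 84.29, "recall": 83.17, "f1": 82.81},
        "image_text": {"accuracy": 87.66, "precision": 88.26, "recall": 87.66, "f1": 87.56},
    },
}


def gains(result):
    """Percentage-point gain of each metric when the text modality is added."""
    return {
        metric: round(result["image_text"][metric] - result["image_only"][metric], 2)
        for metric in result["image_text"]
    }


text_gains = {name: gains(result) for name, result in ablation.items()}
for name, gain in text_gains.items():
    print(name, gain)
```

On these numbers the text modality adds about 1 percentage point of accuracy on PlantVillage and about 4.5 on AI Challenger 2018, so the gain is larger on the dataset where the image-only baseline is weaker.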
1 | KANG L, YUAN J Q, GAO R, et al. Early detection and identification of rice blast based on hyperspectral image [J]. Spectroscopy and Spectral Analysis, 2021, 41(3): 898-902. |
2 | JOHANNES A, PICON A, ALVAREZ-GILA A, et al. Automatic plant disease diagnosis using mobile capture devices, applied on a wheat use case [J]. Computers and Electronics in Agriculture, 2017, 138: 200-209. |
3 | AGARWAL M, GUPTA S K, BISWAS K K. Development of efficient CNN model for tomato crop disease identification [J]. Sustainable Computing: Informatics and Systems, 2020, 28: No.100407. |
4 | ZHAO H Q, YANG Y F, LIU Z L, et al. Step-by-step identification method of crop leaf diseases based on transfer learning [J]. Bulletin of Surveying and Mapping, 2021(7): 34-38. |
5 | SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition [EB/OL]. [2024-09-04]. . |
6 | HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition [C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 770-778. |
7 | DU H S, ZHANG C H, AN W H, et al. Crop disease recognition based on multi-layer information fusion and saliency feature enhancement [J]. Transactions of the Chinese Society for Agricultural Machinery, 2023, 54(7): 214-222. |
8 | SCHWARZ SCHULER J P, ROMANI S, ABDEL-NASSER M, et al. Color-aware two-branch DCNN for efficient plant disease classification [J]. MENDEL, 2022, 28(1): 55-62. |
9 | THAKUR P S, SHEOREY T, OJHA A. VGG-ICNN: a lightweight CNN model for crop disease identification [J]. Multimedia Tools and Applications, 2023, 82(1): 497-520. |
10 | JIANG H H, YANG X H, DING R R, et al. Identification of apple leaf diseases based on improved ResNet18 [J]. Transactions of the Chinese Society for Agricultural Machinery, 2023, 54(4): 295-303. |
11 | HUANG L S, LUO Y W, YANG X D, et al. Crop disease recognition based on attention mechanism and multi-scale residual network [J]. Transactions of the Chinese Society for Agricultural Machinery, 2021, 52(10): 264-271. |
12 | SUN W B, WANG R, GAO R H, et al. Crop disease recognition based on visible spectrum and improved attention module [J]. Spectroscopy and Spectral Analysis, 2022, 42(5): 1572-1580. |
13 | XIAO T C, CHEN Y H, LI Y K, et al. Study on crop disease identification model based on improved channel attention mechanism [J]. Jiangsu Agricultural Sciences, 2023, 51(24): 168-175. |
14 | RADFORD A, KIM J W, HALLACY C, et al. Learning transferable visual models from natural language supervision [C]// Proceedings of the 38th International Conference on Machine Learning. New York: JMLR.org, 2021: 8748-8763. |
15 | LI J, LI D, SAVARESE S, et al. BLIP-2: bootstrapping language-image pre-training with frozen image encoders and large language models [C]// Proceedings of the 40th International Conference on Machine Learning. New York: JMLR.org, 2023: 19730-19742. |
16 | LIU H, LI C, WU Q, et al. Visual instruction tuning [C]// Proceedings of the 37th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2024: 34892-34916. |
17 | HUGHES D P, SALATHÉ M. An open access repository of images on plant health to enable the development of mobile disease diagnostics [EB/OL]. [2023-11-23]. . |
18 | SINGH D, JAIN N, JAIN P, et al. PlantDoc: a dataset for visual plant disease detection [C]// Proceedings of the 7th ACM IKDD CoDS and 25th COMAD. New York: ACM, 2020: 249-253. |
19 | ILHARCO G, WORTSMAN M, WIGHTMAN R, et al. OpenCLIP[CP/OL]. [2023-07-23]. . |
20 | SUN Q, YU Q, CUI Y, et al. Emu: generative pretraining in multimodality [EB/OL]. [2024-07-30]. . |
21 | BIRD S. NLTK: the natural language toolkit [C]// Proceedings of the Interactive Presentation Sessions of 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2006: 69-72. |
22 | SCHUHMANN C, BEAUMONT R, VENCU R, et al. LAION-5B: an open large-scale dataset for training next generation image-text models [C]// Proceedings of the 36th Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2022: 25278-25294. |
23 | DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional Transformers for language understanding [C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Stroudsburg: ACL, 2019: 4171-4186. |
24 | DONG Y, CORDONNIER J B, LOUKAS A. Attention is not all you need: pure attention loses rank doubly exponentially with depth [C]// Proceedings of the 38th International Conference on Machine Learning. New York: JMLR.org, 2021: 2793-2803. |
25 | LIU Y, ZHANG X, DING J, et al. Knowledge-infused contrastive learning for urban imagery-based socioeconomic prediction [C]// Proceedings of the ACM Web Conference 2023. New York: ACM, 2023: 4150-4160. |
26 | LI T, XIN S, XI Y, et al. Predicting multi-level socioeconomic indicators from structural urban imagery [C]// Proceedings of the 31st ACM International Conference on Information and Knowledge Management. New York: ACM, 2022: 3282-3291. |
27 | LI Y, MAO H, GIRSHICK R, et al. Exploring plain vision transformer backbones for object detection [C]// Proceedings of the 2022 European Conference on Computer Vision, LNCS 13669. Cham: Springer, 2022: 280-296. |
28 | HE K, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN [C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 2980-2988. |
29 | GAO R H, BAI Q, WANG R, et al. Multi-tree network multi-crop early disease recognition method based on improved attention mechanism [J]. Computer Science, 2022, 49(6A): 363-369. |
30 | HUANG Q, WU X, WANG Q, et al. Knowledge distillation facilitates the lightweight and efficient plant diseases detection model [J]. Plant Phenomics, 2023, 5: No.0062. |
31 | WANG D F, WANG J. Crop disease classification with transfer learning and residual networks [J]. Transactions of the Chinese Society of Agricultural Engineering, 2021, 37(4): 199-207. |
32 | LEE H, AHN S. Improving the performance of object detection by preserving label distribution [J]. Mathematics, 2023, 11(21): No.4460. |
33 | CARION N, MASSA F, SYNNAEVE G, et al. End-to-end object detection with Transformers [C]// Proceedings of the 2020 European Conference on Computer Vision, LNCS 12346. Cham: Springer, 2020: 213-229. |