Journal of Computer Applications


Agricultural user identity recognition method integrating image-text and large language model

SHANG Yimeng1, CHI Zenglin1, ZHANG Hongming1, LIU Bin1, HU Guoqiang2, NIU Dangdang1   

  1. College of Information Engineering, Northwest A&F University; 2. Network & Education Technology Center, Northwest A&F University
  • Received: 2025-09-17 Revised: 2025-12-24 Online: 2026-01-21 Published: 2026-01-21
  • About author: SHANG Yimeng, born in 2001, M. S. candidate. His research interests include deep learning, multimodal learning. CHI Zenglin, born in 1997, M. S. candidate. His research interests include combinatorial optimization. ZHANG Hongming, born in 1979, Ph. D., professor. His research interests include artificial intelligence, smart agriculture. LIU Bin, born in 1981, Ph. D., professor. His research interests include computer vision, parallel algorithms for deep learning. HU Guoqiang, born in 1981, M. S., senior engineer. His research interests include information network technology application, education informatization. NIU Dangdang, born in 1990, Ph. D., associate professor. His research interests include smart agriculture, natural language processing, combinatorial optimization.
  • Supported by:
    National Natural Science Foundation of China (62206222); National Key Research and Development Program of China (2020YFD1100601)

融合图文与大语言模型的农业用户身份识别方法

尚毅蒙1,迟增林1,张宏鸣1,刘斌1,胡国强2,牛当当1   

  1. 西北农林科技大学 信息工程学院; 2. 西北农林科技大学 网络与教育技术中心
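  • Corresponding author: NIU Dangdang
  • About author: SHANG Yimeng (born 2001), male, from Qingyang, Gansu, M. S. candidate; main research interests: deep learning, multimodal learning. CHI Zenglin (born 1997), male, from Qingdao, Shandong, M. S. candidate; main research interests: combinatorial optimization. ZHANG Hongming (born 1979), male, from Chifeng, Inner Mongolia, professor, Ph. D., CCF member; main research interests: artificial intelligence, smart agriculture. LIU Bin (born 1981), male, from Weinan, Shaanxi, professor, Ph. D., CCF member; main research interests: computer vision, parallel algorithms for deep learning. HU Guoqiang (born 1981), male, from Zhouzhi, Shaanxi, senior engineer, M. S.; main research interests: information network technology application, education informatization. NIU Dangdang (born 1990), male, from Zhouzhi, Shaanxi, associate professor, Ph. D.; main research interests: smart agriculture, natural language processing, combinatorial optimization.
  • Supported by:
    National Natural Science Foundation of China (62206222); National Key Research and Development Program of China (2020YFD110060)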

Abstract: The cold-start problem in recommender systems significantly impairs the quality and precision of personalized services because historical interaction data are absent. To alleviate this issue, user avatars and nicknames, which are available as soon as a user logs in to a recommender system, were utilized, and a multimodal deep learning framework, termed ResBERT-MMNet (ResNet-BERT MultiModal Network), was constructed for user identity recognition. Visual and textual features were fused within this framework to enhance user profiling performance, and a multi-layer stacked bidirectional residual cross-modal attention mechanism was designed to promote semantic alignment and interaction between the image and text modalities. Furthermore, to address the risk of incorrect predictions produced by the multimodal model under low-confidence conditions, an auxiliary identity inference module based on a Large Language Model (LLM) was incorporated. Specifically, the DeepSeek-R1-Distill-Qwen-7B model was fine-tuned using Low-Rank Adaptation (LoRA) to infer agricultural user identity types solely from nickname information, and its outputs were employed as a supplementary decision-making component to improve system robustness. Experimental results on a self-built dataset demonstrate that the proposed multimodal fusion strategy outperforms other fusion strategies across multiple agricultural user classification tasks. After integration with the fine-tuned LLM, the overall framework achieves an accuracy of 69.73% and an F1-score of 70.06%, indicating strong practical applicability. The proposed method provides essential prior user profiles during the cold-start stage of agricultural recommender systems and offers a feasible technical solution for enhanced user profiling and precision agriculture services.

Key words: recommender system, cold-start, multimodal learning, cross-modal attention, residual structure, Low-Rank Adaptation (LoRA)
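
The abstract describes the LLM as a supplementary decision component for low-confidence multimodal predictions, but does not state the exact decision rule. The following is a minimal sketch of one plausible reading, a confidence-gated fallback. The threshold value, the label names, the `predict_identity` function, and the `stub_llm` stand-in are all assumptions made for illustration and are not taken from the paper.

```python
import math
from typing import Callable, Sequence, Tuple


def softmax(logits: Sequence[float]) -> list:
    """Convert raw classifier scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_identity(
    logits: Sequence[float],
    labels: Sequence[str],
    nickname: str,
    llm_predict: Callable[[str], str],
    threshold: float = 0.6,  # assumed value, not reported in the abstract
) -> Tuple[str, str]:
    """Return (label, source), where source is "multimodal" or "llm"."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=lambda i: probs[i])
    # Confident multimodal prediction: accept it directly.
    if probs[best] >= threshold:
        return labels[best], "multimodal"
    # Low confidence: ask the nickname-only LLM for a second opinion.
    llm_label = llm_predict(nickname)
    if llm_label in labels:
        return llm_label, "llm"
    # The LLM answer is not a known class: keep the multimodal guess.
    return labels[best], "multimodal"


# Hypothetical identity classes and a stub standing in for the fine-tuned LLM.
LABELS = ["grower", "dealer", "researcher"]


def stub_llm(nickname: str) -> str:
    return "dealer" if "shop" in nickname.lower() else "unknown"
```

In a real system `stub_llm` would be replaced by a call to the fine-tuned model, and the threshold would be tuned on validation data rather than fixed in advance.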

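Abstract: In the cold-start stage, a recommender system cannot rely on historical interaction data for effective modeling, which severely limits the quality and precision of personalized services. To address this problem, the avatar and nickname that become available as soon as a user logs in to the recommender system were taken as the research objects, and a multimodal deep learning model combining avatar and nickname information, ResBERT-MMNet (ResNet-BERT MultiModal Network), was proposed. A multi-layer stacked bidirectional residual cross-modal attention mechanism was designed to realize semantic interaction between image and text and to strengthen the understanding of user identity. Furthermore, to address the limitation that the multimodal model may produce incorrect predictions in low-confidence scenarios, agricultural user identity type prediction based on a Large Language Model (LLM) was introduced: the DeepSeek-R1-Distill-Qwen-7B model was fine-tuned with Low-Rank Adaptation (LoRA) so that it can infer a user's identity type from the nickname, and the LLM output was then used as a supplementary decision module to improve overall system robustness. Experimental results on a self-built dataset show that the designed multimodal fusion strategy performs better than other fusion strategies in recognizing multiple categories of agricultural users. Combined with the fine-tuned LLM, the recognition precision rises to 69.73% and the F1 score reaches 70.06%, showing strong practical applicability. The method provides prior data for agricultural recommender systems in the cold-start stage and contributes a technical approach to user profile construction and precise services.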

Key words: recommender system, cold-start, multimodal learning, cross-modal attention, residual structure, low-rank adaptation
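
For readers unfamiliar with the two mechanisms named in the abstract, they can be written compactly as follows. The LoRA update is the standard formulation. The cross-modal attention layer is only a generic sketch: the abstract does not specify normalization, the number of heads, or the projection layout, so the symbols $T^{(l)}$ (text features at layer $l$), $I^{(l)}$ (image features at layer $l$), and the projection matrices are assumptions introduced here.

```latex
% LoRA: the pretrained weight W_0 stays frozen; only A and B are trained.
W = W_0 + \frac{\alpha}{r} B A, \qquad
B \in \mathbb{R}^{d \times r},\;
A \in \mathbb{R}^{r \times k},\;
r \ll \min(d, k)

% Scaled dot-product attention.
\mathrm{Attn}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V

% One bidirectional residual cross-modal layer (assumed generic form):
% text queries attend to image features, and image queries attend to text features.
T^{(l+1)} = T^{(l)} + \mathrm{Attn}\!\left(T^{(l)} W_Q^{t},\; I^{(l)} W_K^{t},\; I^{(l)} W_V^{t}\right)
I^{(l+1)} = I^{(l)} + \mathrm{Attn}\!\left(I^{(l)} W_Q^{i},\; T^{(l)} W_K^{i},\; T^{(l)} W_V^{i}\right)
```

Stacking several such layers lets each modality repeatedly refine its representation using the other, while the residual terms preserve the original unimodal features.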
