Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (3): 801-807. DOI: 10.11772/j.issn.1001-9081.2024101537

• Frontier Research and Typical Applications of Large Models •


Large language model prompt generation method for engineering drawing understanding

Chenwei SUN1, Junli HOU2, Xianggen LIU1, Jiancheng LYU1()   

1. College of Computer Science, Sichuan University, Chengdu, Sichuan 610065, China
    2. Southwest Institute of Electronic Equipment, Chengdu, Sichuan 610036, China
  • Received: 2024-10-15  Revised: 2024-12-20  Accepted: 2024-12-26  Online: 2025-02-07  Published: 2025-03-10
  • Contact: Jiancheng LYU
  • About author: SUN Chenwei, born in 2000 in Jinan, Shandong, M.S. candidate. His research interests include natural language processing and artificial intelligence.
    HOU Junli, born in 1970 in Hebi, Henan, Ph.D., senior engineer. His research interests include artificial intelligence.
    LIU Xianggen, born in 1993 in Chengdu, Sichuan, Ph.D., associate professor. His research interests include natural language processing.
  • Supported by:
    National Natural Science Foundation of China (62206192); National Key Research and Development Program of China (2024YFB3312503); Major Science and Technology Project of Sichuan Province (2024ZDZX0003)


Abstract:

In recent years, Large Language Models (LLMs) have demonstrated excellent language understanding and dialogue capabilities in fields such as natural language processing and computer vision. However, in professional domains they often produce inference results that are inconsistent with the correct answers, which poses significant challenges to the application of LLMs in precise and accurate decision-making tasks. To address this problem, a rule-guided Post Prompt of Large Language Model (PP-LLM) generation method was proposed. In this method, by generating a post prompt, the original problem was transformed into two sub-problems that are easier to solve, thereby introducing expert knowledge and reducing the difficulty of task learning. Specifically, knowledge-guided rules were used to transform the output part of the supervised dataset into a combination of a post prompt and the original output. The PP-LLM method changes neither the training nor the inference process of the model, and adds no computational cost. Experimental results show that the PP-LLM method improves the accuracy of inference results significantly and narrows the gap between model predictions and actual answers: compared with the results obtained without the proposed method, metrics such as F1 score and Recall-Oriented Understudy for Gisting Evaluation (ROUGE) are improved significantly. The above work improves the reliability of LLMs in professional applications and provides new ideas for LLM generation technology.
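The dataset transformation summarized above (rewriting each supervised output into a post prompt followed by the original output, so the model first solves an easier sub-problem) can be sketched as follows. This is a minimal illustration under assumed conventions: the rule, the `make_post_prompt` and `transform_example` names, and the field-before-colon format are hypothetical, not the paper's actual implementation.

```python
# Sketch of a rule-guided post-prompt transformation (illustrative only).

def make_post_prompt(output: str) -> str:
    """Hypothetical knowledge-guided rule: announce which field the answer
    concerns before stating the answer itself, splitting one hard prediction
    into two easier sub-problems (predict the field, then the value)."""
    field, _, _ = output.partition(":")
    return f"The answer concerns the field '{field.strip()}'."

def transform_example(example: dict) -> dict:
    """Rewrite a supervised pair so the training target is the post prompt
    followed by the original output. Only the dataset changes; the model's
    training and inference procedures stay exactly as before."""
    post_prompt = make_post_prompt(example["output"])
    return {
        "input": example["input"],
        "output": f"{post_prompt}\n{example['output']}",
    }

example = {
    "input": "What is the surface roughness marked on part A?",
    "output": "surface roughness: Ra 1.6",
}
# Prints the post prompt on one line, then the original answer.
print(transform_example(example)["output"])
```

Because only the targets in the supervised dataset are rewritten, this transformation adds no inference-time cost, matching the abstract's claim that training and inference are unchanged.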

Key words: engineering drawing, Large Language Model (LLM), data augmentation, multi-modal, prompt

CLC Number: