Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (3): 685-696.DOI: 10.11772/j.issn.1001-9081.2025010128

• Frontier research and typical applications of large models •

Survey and prospect of large language models

Xiaolin QIN1,2, Xu GU1,2, Dicheng LI1,2, Haiwen XU3

  1. Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu, Sichuan 610213, China
    2. School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049, China
    3. Faculty of Science, Civil Aviation Flight University of China, Guanghan, Sichuan 618307, China
  • Received: 2025-02-10 Revised: 2025-02-17 Accepted: 2025-02-19 Online: 2025-02-27 Published: 2025-03-10
  • Contact: Xiaolin QIN
  • About authors: GU Xu, born in 1998 in Chengdu, Sichuan, Ph.D. candidate. His research interests include natural language processing and industry-specific large language models.
    LI Dicheng, born in 2003 in Lu'an, Anhui, M.S. candidate. His research interests include natural language processing and industry-specific large language models.
    XU Haiwen, born in 1978 in Heze, Shandong, Ph.D., professor. His research interests include optimization theory and algorithms, and transportation planning and management.
  • Supported by:
    National Key Research and Development Program of China (2023YFB3308601); Sichuan Science and Technology Program (2024NSFJQ0035); Talent Program of the Organization Department of the Sichuan Provincial Party Committee

Abstract:

Large Language Models (LLMs) are a class of language models built from artificial neural networks with vast numbers of parameters (typically billions of weights or more). Trained on large amounts of unlabeled text through self-supervised or semi-supervised learning, they form the core of current generative Artificial Intelligence (AI) technologies. Compared with traditional language models, LLMs draw on substantial computational power, extensive parameters, and large-scale data to achieve stronger language understanding and generation capabilities, and they perform well across tasks such as machine translation, question answering, and dialogue generation. Most existing surveys focus on the theoretical construction and training techniques of LLMs, while systematic exploration of their industry-level application practice and the evolution of their technological ecosystem remains insufficient. Therefore, building on an introduction to the foundational architecture, training techniques, and development history of LLMs, this survey analyzes the current general key technologies of LLMs as well as advanced technologies that integrate with LLMs as a foundation. By summarizing existing research, it further elaborates the challenges that LLMs face in practical applications, including data bias, model hallucination, and computational resource consumption, and offers an outlook on the ongoing development trends of LLMs.

Key words: Large Language Model (LLM), Agent, Natural Language Processing (NLP), Retrieval-Augmented Generation (RAG), model hallucination
