Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (3): 685-696. DOI: 10.11772/j.issn.1001-9081.2025010128
• Frontier Research and Typical Applications of Large Models •
Survey and prospect of large language models
Xiaolin QIN1,2, Xu GU1,2, Dicheng LI1,2, Haiwen XU3
Received:
2025-02-10
Revised:
2025-02-17
Accepted:
2025-02-19
Online:
2025-02-27
Published:
2025-03-10
Contact:
Xiaolin QIN
About author:
GU Xu, born in 1998 in Chengdu, Sichuan, Ph.D. candidate. His research interests include natural language processing and industry-specific large language models.
Abstract:
Large Language Models (LLMs) are a class of language models composed of artificial neural networks with a very large number of parameters (typically billions of weights or more), trained on massive amounts of unlabeled text with self-supervised or semi-supervised learning, and they form the core of current generative Artificial Intelligence (AI) technology. Compared with traditional language models, LLMs, supported by massive compute, parameters, and data, exhibit stronger language understanding and generation capabilities, and are widely applied with excellent performance to numerous tasks such as machine translation, question answering systems, and dialogue generation. Most existing surveys focus on the theoretical architectures and training methods of LLMs, while systematic discussion of industrial-grade application practice and the evolution of the technology ecosystem remains insufficient. Therefore, on the basis of introducing the basic architectures, training techniques, and development history of LLMs, this paper analyzes the key technologies of current general-purpose LLMs and the advanced fusion technologies built on LLMs as a foundation. By summarizing existing research, it further elaborates the challenges that LLMs face in practical applications, including data bias, model hallucination, and computational resource consumption, and provides an outlook on the continuing development trends of LLMs.
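The self-supervised objective described above can be made concrete with a short sketch. The snippet below is purely illustrative (a toy vocabulary and a single embedding/linear pair standing in for a full Transformer stack; all names are hypothetical, not from the paper): the model is trained to predict token i+1 from tokens 1..i, so the raw text itself supplies the labels.

```python
import torch
import torch.nn.functional as F

# Toy stand-ins for a real LLM; sizes are illustrative only.
vocab_size, seq_len, d_model = 100, 8, 32
embed = torch.nn.Embedding(vocab_size, d_model)
lm_head = torch.nn.Linear(d_model, vocab_size)

tokens = torch.randint(0, vocab_size, (1, seq_len))  # "unlabeled text" as token ids
hidden = embed(tokens)        # a real model would run a Transformer stack here
logits = lm_head(hidden)      # shape: (1, seq_len, vocab_size)

# Shifted cross-entropy: position i predicts token i+1 -- no human labels needed.
loss = F.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size),
    tokens[:, 1:].reshape(-1),
)
loss.backward()  # gradients then feed an optimizer such as AdamW [50]
```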
Xiaolin QIN, Xu GU, Dicheng LI, Haiwen XU. Survey and prospect of large language models[J]. Journal of Computer Applications, 2025, 45(3): 685-696.
Tab. 1 Existing LLM architectures and models

| Architecture | Variant | Representative models |
| --- | --- | --- |
| Encoder-only | — | BERT |
| Encoder-decoder | — | Flan-UL2 |
| Decoder-only | Causal decoder | GPT series |
| Decoder-only | Prefix decoder | PaLM |
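The practical difference between the causal-decoder and prefix-decoder rows in Table 1 lies in the attention mask. A minimal sketch in PyTorch (helper names are ours, not taken from any cited model): the causal mask is lower-triangular, while the prefix mask additionally lets the prompt tokens attend to each other bidirectionally.

```python
import torch

def causal_mask(seq_len: int) -> torch.Tensor:
    # Causal decoder (e.g., the GPT series): token i attends only to tokens <= i.
    return torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))

def prefix_mask(seq_len: int, prefix_len: int) -> torch.Tensor:
    # Prefix decoder: the prefix (prompt) attends to itself bidirectionally,
    # while tokens generated after the prefix remain causal.
    mask = causal_mask(seq_len)
    mask[:prefix_len, :prefix_len] = True
    return mask

print(causal_mask(4).int())                 # strictly autoregressive pattern
print(prefix_mask(4, prefix_len=2).int())   # bidirectional block in the top-left
```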
Tab. 2 Performance comparison results of OpenAI o1 and OpenAI o3

| Model | SWE-bench Verified accuracy/% | AIME accuracy/% | Codeforces ELO score |
| --- | --- | --- | --- |
| OpenAI o1-preview | 41.3 | 56.7 | 1 258 |
| OpenAI o1 | 48.9 | 83.3 | 1 891 |
| OpenAI o3 | 71.7 | 96.7 | 2 727 |
Tab. 3 Comparison of different RAG frameworks

| RAG framework | Focus | Key components | Application scenarios | Flexibility | Ease of integration | Developer suitability |
| --- | --- | --- | --- | --- | --- | --- |
| LlamaIndex | Deep integration of language models with external resources | Data connectors, engines, data agents, index management, etc. | Semantic search, document indexing, etc. | Optimized specifically for data indexing and search; highly focused | Smooth integration with data sources; supports real-time data source access | LLM data augmentation, with emphasis on data curation and search |
| LangChain | Index construction and retrieval optimization | Retrieval systems, data indexing mechanisms, etc. | Context-aware queries, data processing tasks, etc. | Technically simple and easy to build; relatively limited extensibility for complex tasks | Strong compatibility with a wide range of data sources and services; flexibly meets integration needs | Building knowledge Q&A systems, document search engines, and other massive-data processing scenarios |
| RAGFlow | Intelligent document parsing and management | Intelligent document parsing system, LLM-based queries, document upload and management, multiple processing templates | Intelligent document processing for industries such as accounting, human resources, and research; evidence-based question answering | Highly customizable document processing; supports the needs of multiple industries and roles | Easy integration of document upload and management; supports multiple document formats and industry-specific needs | Developing intelligent document processing systems combined with LLM-based queries |
RAG框架 | 聚焦点 | 关键组件 | 应用场景 | 灵活性 | 集成难易度 | 开发者适用性 |
---|---|---|---|---|---|---|
Llamalndex | 语言模型与外部资源的深度集成 | 数据连接器、引擎、数据代理、索引管理等组件 | 语义搜索、文档索引等任务 | 专为数据索引与搜索优化,具备较高专注性 | 与数据源集成较为顺畅,数据源实时接入 | LLM数据增强,注重数据整理与搜索 |
LangChain | 索引构建与检索优化 | 检索系统、数据索引机制等 | 上下文感知查询、数据处理任务等 | 技术简单易于构建,对复杂任务的扩展性相对不足 | 与各种数据源和服务的兼容性强,能够灵活应对集成需求 | 构建知识问答系统、文档搜索引擎及其他海量数据处理场景 |
RAGflow | 智能文档解析和管理 | 智能文档解析系统、基于LLM的查询、文档上传与管理、多个处理模板 | 会计、人力资源、研究等行业应用的智能文档处理,基于证据的问答 | 高度定制化的文档处理,支持多种行业与角色的需求 | 易于集成文档上传和管理,支持多种文档格式及行业特定需求 | 开发智能文档处理系统并结合LLM查询 |
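All three frameworks in Table 3 build on the same retrieve-then-generate loop: embed the documents offline, fetch the top-k passages most similar to the query, and prepend them to the prompt. The sketch below shows only that shared pattern; the toy `embed` function and corpus are hypothetical, and none of this is the actual API of LlamaIndex, LangChain, or RAGFlow.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Toy embedding seeded from the text hash; a real system would call
    # an embedding model instead.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(64)
    return v / np.linalg.norm(v)

corpus = [
    "Invoice auditing procedure for the accounting team.",
    "HR onboarding checklist for new employees.",
    "Notes on recent LLM survey papers.",
]
index = np.stack([embed(doc) for doc in corpus])  # offline indexing step

def retrieve(query: str, k: int = 2) -> list[str]:
    scores = index @ embed(query)                 # cosine similarity (unit vectors)
    return [corpus[i] for i in np.argsort(scores)[::-1][:k]]

def build_prompt(query: str) -> str:
    context = "\n".join(retrieve(query))          # ground the answer in evidence
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

# In a real pipeline this prompt is sent to an LLM for evidence-based answering.
print(build_prompt("How do I audit an invoice?"))
```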
1 | CHOMSKY N. Syntactic structures [M]. Berlin: De Gruyter Mouton, 1957: 8-29. |
2 | WEIZENBAUM J. ELIZA — a computer program for the study of natural language communication between man and machine [J]. Communications of the ACM, 1966, 9(1): 36-45. |
3 | MILLER G A. WordNet: a lexical database for English [J]. Communications of the ACM, 1995, 38(11): 39-41. |
4 | BRANTS T, POPAT A C, XU P, et al. Large language models in machine translation [C]// Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning. Stroudsburg: ACL, 2007: 858-867. |
5 | BERGER A L, DELLA PIETRA S A, DELLA PIETRA V J. A maximum entropy approach to natural language processing [J]. Computational Linguistics, 1996, 22(1): 39-71. |
6 | BENGIO Y, DUCHARME R, VINCENT P, et al. A neural probabilistic language model [J]. Journal of Machine Learning Research, 2003, 3: 1137-1155. |
7 | LIPTON Z C, BERKOWITZ J, ELKAN C. A critical review of recurrent neural networks for sequence learning [EB/OL]. [2024-08-01].
8 | HOCHREITER S, SCHMIDHUBER J. Long short-term memory [J]. Neural Computation, 1997, 9(8): 1735-1780.
9 | WU Y, SCHUSTER M, CHEN Z, et al. Google’s neural machine translation system: bridging the gap between human and machine translation [EB/OL]. [2024-03-02].
10 | VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2017: 6000-6010. |
11 | ZHOU Y, MURESANU A I, HAN Z, et al. Large language models are human-level prompt engineers [EB/OL]. [2024-08-23].
12 | DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional Transformers for language understanding [C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Stroudsburg: ACL, 2019: 4171-4186. |
13 | RADFORD A, NARASIMHAN K, SALIMANS T, et al. Improving language understanding by generative pre-training [EB/OL]. [2024-10-01].
14 | RADFORD A, WU J, CHILD R, et al. Language models are unsupervised multitask learners [EB/OL]. [2024-01-13].
15 | BROWN T B, MANN B, RYDER N, et al. Language models are few-shot learners [C]// Proceedings of the 34th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2020: 1877-1901. |
16 | HUANG J, CHANG K C C. Towards reasoning in large language models: a survey [C]// Findings of the Association for Computational Linguistics: ACL 2023. Stroudsburg: ACL, 2023: 1049-1065. |
17 | WENG Y, ZHU M, XIA F, et al. Large language models are better reasoners with self-verification [C]// Findings of the Association for Computational Linguistics: EMNLP 2023. Stroudsburg: ACL, 2023: 2550-2575. |
18 | PENG K, DING L, ZHONG Q, et al. Towards making the most of ChatGPT for machine translation [C]// Findings of the Association for Computational Linguistics: EMNLP 2023. Stroudsburg: ACL, 2023: 5622-5633. |
19 | SHAILENDRA P, GHOSH R C, KUMAR R, et al. Survey of large language models for answering questions across various fields[C]// Proceedings of the 10th International Conference on Advanced Computing and Communication Systems. Piscataway: IEEE, 2024: 520-527. |
20 | GU X, CHEN X, LU P, et al. AGCVT-prompt for sentiment classification: automatically generating chain of thought and verbalizer in prompt learning [J]. Engineering Applications of Artificial Intelligence, 2024, 132: No.107907. |
21 | OpenAI. GPT-4 technical report [R/OL]. [2024-07-12].
22 | LIU P, YUAN W, FU J, et al. Pre-train, prompt, and predict: a systematic survey of prompting methods in natural language processing[J]. ACM Computing Surveys, 2023, 55(9): No.195. |
23 | ZHANG Q T, WANG Y C, WANG H X, et al. Comprehensive review of large language model fine-tuning [J]. Computer Engineering and Applications, 2024, 60(17): 17-33.
24 | LUO J Z, SUN Y L, QIAN Z Z, et al. Overview and prospect of artificial intelligence large models [J]. Radio Engineering, 2023, 53(11): 2461-2472.
25 | ZHANG J, XU J, XU P D, et al. Review and prospect of application of artificial intelligence large model in power system operation control [J]. Engineering Journal of Wuhan University, 2023, 56(11): 1368-1379.
26 | LIU A P, JIN X, HU G Q. Overview of large models of artificial intelligence and their prospects of financial applications [J]. Artificial Intelligence View, 2023(2): 29-40.
27 | Qwen Team. Qwen2.5 technical report [R/OL]. [2025-01-13].
28 | DeepSeek-AI. DeepSeek-V3 technical report [R/OL]. [2025-02-19].
29 | JIANG A Q, SABLAYROLLES A, ROUX A, et al. Mixtral of experts [EB/OL]. [2024-10-24].
30 | JIAO X, YIN Y, SHANG L, et al. TinyBERT: distilling BERT for natural language understanding [C]// Findings of the Association for Computational Linguistics: EMNLP 2020. Stroudsburg: ACL, 2020: 4163-4174. |
31 | BAI H, ZHANG W, HOU L, et al. BinaryBERT: pushing the limit of BERT quantization [C]// Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Stroudsburg: ACL, 2021: 4334-4348. |
32 | LAN Z, CHEN M, GOODMAN S, et al. ALBERT: a lite BERT for self-supervised learning of language representations [EB/OL]. [2024-04-23].
33 | SHAZEER N. GLU variants improve Transformer [EB/OL]. [2024-03-14].
34 | WARNER B, CHAFFIN A, CLAVIÉ B, et al. Smarter, better, faster, longer: a modern bidirectional encoder for fast, memory efficient, and long context finetuning and inference [EB/OL]. [2024-12-25].
35 | SU J, AHMED M, LU Y, et al. RoFormer: enhanced Transformer with rotary position embedding [J]. Neurocomputing, 2024, 568: No.127063.
36 | DAO T. FlashAttention-2: faster attention with better parallelism and work partitioning [EB/OL]. [2024-11-07].
37 | TAY Y, DEHGHANI M, TRAN V Q, et al. UL2: unifying language learning paradigms [EB/OL]. [2024-06-03].
38 | RAFFEL C, SHAZEER N, ROBERTS A, et al. Exploring the limits of transfer learning with a unified text-to-text Transformer[J]. Journal of Machine Learning Research, 2020, 21: 1-67. |
39 | LEWIS M, LIU Y, GOYAL N, et al. BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension [C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2020: 7871-7880. |
40 | XUE L, CONSTANT N, ROBERTS A, et al. mT5: a massively multilingual pre-trained Text-to-Text Transformer [C]// Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics. Stroudsburg: ACL, 2021: 483-498. |
41 | BigScience Workshop. BLOOM: a 176B-parameter open-access multilingual language model [EB/OL]. [2024-09-04].
42 | ZHANG S, ROLLER S, GOYAL N, et al. OPT: open pre-trained Transformer language models [EB/OL]. [2024-05-01].
43 | RAE J W, BORGEAUD S, CAI T, et al. Scaling language models: methods, analysis & insights from training Gopher [EB/OL]. [2024-11-03].
44 | TOUVRON H, LAVRIL T, IZACARD G, et al. LLaMA: open and efficient foundation language models [EB/OL]. [2024-12-22].
45 | GLM Team. ChatGLM: a family of large language models from GLM-130B to GLM-4 all tools [EB/OL]. [2024-09-13].
46 | YANG A, XIAO B, WANG B, et al. Baichuan 2: open large-scale language models [EB/OL]. [2024-02-01].
47 | CHOWDHERY A, NARANG S, DEVLIN J, et al. PaLM: scaling language modeling with pathways [J]. Journal of Machine Learning Research, 2023, 24: 1-113. |
48 | DU Z, QIAN Y, LIU X, et al. GLM: general language model pretraining with autoregressive blank infilling [C]// Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg: ACL, 2022: 320-335. |
49 | KINGMA D P, BA L J. Adam: a method for stochastic optimization [EB/OL]. [2024-05-05].
50 | LOSHCHILOV I, HUTTER F. Decoupled weight decay regularization [EB/OL]. [2023-11-01].
51 | SHAZEER N, STERN M. Adafactor: adaptive learning rates with sublinear memory cost [C]// Proceedings of the 35th International Conference on Machine Learning. New York: JMLR.org, 2018: 4596-4604. |
52 | ZHENG Q, XIA X, ZOU X, et al. CodeGeeX: a pre-trained model for code generation with multilingual benchmarking on HumanEval-X [C]// Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. New York: ACM, 2023: 5673-5684. |
53 | RASLEY J, RAJBHANDARI S, RUWASE O, et al. DeepSpeed: system optimizations enable training deep learning models with over 100 billion parameters [C]// Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2020: 3505-3506. |
54 | MICIKEVICIUS P, NARANG S, ALBEN J, et al. Mixed precision training [EB/OL]. [2024-08-17].
55 | DETTMERS T, LEWIS M, BELKADA Y, et al. LLM.int8(): 8-bit matrix multiplication for transformers at scale [C]// Proceedings of the 36th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2022: 30318-30332. |
56 | MiniMax. MiniMax-01: scaling foundation models with lightning attention [EB/OL]. [2025-02-11].
57 | WEI J, TAY Y, BOMMASANI R, et al. Emergent abilities of large language models [EB/OL]. [2024-03-23].
58 | TAO C, HOU L, ZHANG W, et al. Compression of generative pre-trained language models via quantization [C]// Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg: ACL, 2022: 4821-4836. |
59 | JACOBS R A, JORDAN M I, NOWLAN S J, et al. Adaptive mixtures of local experts [J]. Neural Computation, 1991, 3(1): 79-87. |
60 | FEDUS W, ZOPH B, SHAZEER N. Switch Transformers: scaling to trillion parameter models with simple and efficient sparsity [J]. Journal of Machine Learning Research, 2022, 23: 1-39. |
61 | SCHULMAN J, WOLSKI F, DHARIWAL P, et al. Proximal policy optimization algorithms [EB/OL]. [2024-06-01].
62 | STIENNON N, OUYANG L, WU J, et al. Learning to summarize from human feedback [C]// Proceedings of the 34th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2020: 3008-3021. |
63 | LIU X, ZHENG Y, DU Z, et al. GPT understands, too [J]. AI Open, 2024, 5: 208-215. |
64 | SCHICK T, SCHÜTZE H. Exploiting cloze-questions for few-shot text classification and natural language inference [C]// Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume. Stroudsburg: ACL, 2021: 255-269. |
65 | LI X L, LIANG P. Prefix-tuning: optimizing continuous prompts for generation [C]// Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Stroudsburg: ACL, 2021: 4582-4597. |
66 | KAPLAN J, McCANDLISH S, HENIGHAN T, et al. Scaling laws for neural language models [EB/OL]. [2024-05-14].
67 | HOFFMANN J, BORGEAUD S, MENSCH A, et al. Training compute-optimal large language models [C]// Proceedings of the 36th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2022: 30016-30030. |
68 | DeepSeek-AI. DeepSeek-R1: incentivizing reasoning capability in LLMs via reinforcement learning [EB/OL]. [2025-02-07].
69 | ROMAN R S, ADI Y, DELEFORGE A, et al. From discrete tokens to high-fidelity audio using multi-band diffusion [C]// Proceedings of the 37th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2023: 1526-1538. |
70 | ZIV A, GAT I, LE LAN G, et al. Masked audio generation using a single non-autoregressive Transformer [EB/OL]. [2024-09-01].
71 | SCHNEIDER S, BAEVSKI A, COLLOBERT R, et al. Wav2Vec: unsupervised pre-training for speech recognition [C]// Proceedings of the INTERSPEECH 2019. [S.l.]: International Speech Communication Association, 2019: 3465-3469. |
72 | CHUNG Y A, ZHANG Y, HAN W, et al. w2v-BERT: combining contrastive learning and masked language modeling for self-supervised speech pre-training [C]// Proceedings of the 2021 IEEE Automatic Speech Recognition and Understanding Workshop. Piscataway: IEEE, 2021: 244-250. |
73 | JIN M, WEN Q, LIANG Y, et al. Large models for time series and spatio-temporal data: a survey and outlook [EB/OL]. [2024-01-24].
74 | YANG C H H, TSAI Y Y, CHEN P Y. Voice2Series: reprogramming acoustic models for time series classification [C]// Proceedings of the 38th International Conference on Machine Learning. New York: JMLR.org, 2021: 11808-11819.
75 | YUE Z, WANG Y, DUAN J, et al. TS2Vec: towards universal representation of time series [C]// Proceedings of the 36th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2022: 8980-8987. |
76 | VAYYAT M, KASI J, BHATTACHARYA A, et al. CLUDA: contrastive learning in unsupervised domain adaptation for semantic segmentation [EB/OL]. [2024-11-21].
77 | BROOKS T, PEEBLES B, HOLMES C, et al. Video generation models as world simulators [EB/OL]. [2024-07-21].
78 | DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: Transformers for image recognition at scale [EB/OL]. [2024-08-11].
79 | RADFORD A, KIM J W, HALLACY C, et al. Learning transferable visual models from natural language supervision [C]// Proceedings of the 38th International Conference on Machine Learning. New York: JMLR.org, 2021: 8748-8763. |
80 | ZHAI X, MUSTAFA B, KOLESNIKOV A, et al. Sigmoid loss for language image pre-training [C]// Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2023: 11941-11952. |
81 | CHEN X, WANG X, BEYER L, et al. PaLI-3 vision language models: smaller, faster, stronger [EB/OL]. [2024-08-24].
82 | XU L, LI A, ZHU L, et al. SuperCLUE: a comprehensive Chinese large language model benchmark [EB/OL]. [2025-01-08].
83 | Kimi Team. Kimi k1.5: scaling reinforcement learning with LLMs [R/OL]. [2025-02-12].
84 | SANH V, DEBUT L, CHAUMOND J, et al. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter [EB/OL]. [2024-08-23].
85 | TAMBE T, HOOPER C, PENTECOST L, et al. EdgeBERT: sentence-level energy optimizations for latency-aware multi-task NLP inference [C]// Proceedings of the 54th Annual IEEE/ACM International Symposium on Microarchitecture. New York: ACM, 2021: 830-844. |
86 | GULATI A, QIN J, CHIU C C, et al. Conformer: convolution-augmented Transformer for speech recognition [C]// Proceedings of the INTERSPEECH 2020. [S.l.]: International Speech Communication Association, 2020: 5036-5040. |
87 | TAN M, LE Q V. EfficientNet: rethinking model scaling for convolutional neural networks [C]// Proceedings of the 36th International Conference on Machine Learning. New York: JMLR.org, 2019: 6105-6114. |
88 | WANG L, MA C, FENG X, et al. A survey on large language model based autonomous agents [J]. Frontiers of Computer Science, 2024, 18(6): No.186345. |
89 | XI Z, CHEN W, GUO X, et al. The rise and potential of large language model based agents: a survey [J]. SCIENCE CHINA Information Sciences, 2025, 68(2): No.121101. |
90 | LI L, XIAO J, CHEN G, et al. Zero-shot visual relation detection via composite visual cues from large language models [C]// Proceedings of the 37th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2023: 50105-50116. |
91 | LI H, HAO Y, ZHAI Y, et al. The Hitchhiker’s guide to program analysis: a journey with large language models [EB/OL]. [2024-09-06].
92 | LIN B Y, FU Y, YANG K, et al. SwiftSage: a generative agent with fast and slow thinking for complex interactive tasks [C]// Proceedings of the 37th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2023: 23813-23825. |
93 | MANDI Z, JAIN S, SONG S. RoCo: dialectic multi-robot collaboration with large language models [C]// Proceedings of the 2024 IEEE International Conference on Robotics and Automation. Piscataway: IEEE, 2024: 286-299. |
94 | ZHANG C, YANG K, HU S, et al. ProAgent: building proactive cooperative agents with large language models [C]// Proceedings of the 38th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2024: 17591-17599. |
95 | KANG Y, KIM J. ChatMOF: an autonomous AI system for predicting and generating metal-organic frameworks [J]. Nature Communications, 2024, 15: No.4705. |
96 | CHEN P L, CHANG C S. InterAct: exploring the potentials of ChatGPT as a cooperative agent [EB/OL]. [2024-03-11].
97 | SHI J, ZHAO J, WANG Y, et al. CGMI: configurable general multi-agent interaction framework [EB/OL]. [2024-02-21].
98 | ZHOU W, JIANG Y E, LI L, et al. Agents: an open-source framework for autonomous language agents [EB/OL]. [2024-01-22].
99 | LIU Y, CHEN W, BAI Y, et al. Aligning cyber space with physical world: a comprehensive survey on embodied AI [EB/OL]. [2024-12-01].
100 | DUAN J, YU S, TAN H L, et al. A survey of embodied AI: from simulators to research tasks [J]. IEEE Transactions on Emerging Topics in Computational Intelligence, 2022, 6(2): 230-244. |
101 | PFEIFER R, BONGARD J. How the body shapes the way we think: a new view of intelligence [M]. Cambridge: MIT Press, 2006. |
102 | ZHU Z, MA X, CHEN Y, et al. 3D-VisTA: pre-trained Transformer for 3D vision and text alignment [C]// Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2023: 2899-2909. |
103 | YANG Y, SUN F Y, WEIHS L, et al. Holodeck: language guided generation of 3D embodied AI environments [C]// Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2024: 16277-16287. |
104 | YAN C, QU D, XU D, et al. GS-SLAM: dense visual SLAM with 3D Gaussian splatting [C]// Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2024: 19595-19604. |
105 | WU X, JIANG L, WANG P S, et al. Point Transformer V3: simpler faster stronger [C]// Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2024: 4840-4851. |
106 | CAI W, HUANG S, CHENG G, et al. Bridging zero-shot object navigation and foundation models through pixel-guided navigation skill [C]// Proceedings of the 2024 IEEE International Conference on Robotics and Automation. Piscataway: IEEE, 2024: 5228-5234. |
107 | WU Y, CHENG X, ZHANG R, et al. EDA: explicit text-decoupling and dense alignment for 3D visual grounding [C]// Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2023: 19231-19242. |
108 | WANG T, MAO X, ZHU C, et al. EmbodiedScan: a holistic multi-modal 3D perception suite towards embodied AI [C]// Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2024: 19757-19767. |
109 | GE Y, TANG Y, XU J, et al. BEHAVIOR Vision Suite: customizable dataset generation via simulation [C]// Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2024: 22401-22412. |
110 | GAN C, SCHWARTZ J, ALTER S, et al. ThreeDWorld: a platform for interactive multi-modal physical simulation [EB/OL]. [2024-03-23].
111 | CHEN J, LIN B, XU R, et al. MapGPT: map-guided prompting with adaptive path planning for vision-and-language navigation[C]// Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg: ACL, 2024: 9796-9810. |