Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (S2): 1-8. DOI: 10.11772/j.issn.1001-9081.2023040428

• Artificial intelligence •    

Research progress and enlightenment of large language models on multi-lingual intelligence

Yuemei XU1, Ling HU1, Jiayi ZHAO1, Wanze DU2, Wenqing WANG2

  1. School of Information Science and Technology, Beijing Foreign Studies University, Beijing 100089, China
    2. College of Software and Microelectronics, Peking University, Beijing 100053, China
  • Received: 2023-04-20 Revised: 2023-08-17 Accepted: 2023-08-21 Online: 2024-01-09 Published: 2023-12-31
  • Contact: Yuemei XU

  • About the authors:
    XU Yuemei, female, born in 1985 in Wuzhou, Guangxi; Ph. D., associate professor; main research interest: cross-lingual natural language processing.
    HU Ling, female, born in 2000 in Nanchang, Jiangxi; M. S. candidate; main research interest: natural language processing.
    ZHAO Jiayi, female, born in 2002 in Weinan, Shaanxi; main research interest: natural language processing.
    DU Wanze, female, born in 2000 in Beijing; M. S. candidate; main research interest: natural language processing.
    WANG Wenqing, female, born in 2000 in Shangqiu, Henan; M. S. candidate; main research interest: natural language processing.
  • Funding:
    Fundamental Research Funds for the Central Universities (2022JJ006)

Abstract:

In view of the fact that Large Language Models (LLMs) perform well on high-resource languages but poorly on low-resource languages, a comprehensive analysis was conducted on the research status, techniques, and limitations of LLMs in multi-lingual scenarios. Firstly, representative language models since 2018, such as Multi-BERT (multi-lingual Bidirectional Encoder Representations from Transformers), GPT (Generative Pre-trained Transformer) and ChatGPT (Chat Generative Pre-trained Transformer), were reviewed to trace the development of LLMs. Then, the exploration of LLMs in multi-lingual intelligence was analyzed in detail, and the research limitations of existing LLMs in multi-lingual intelligence as well as the corresponding improvement directions were summarized. Finally, future application scenarios of multi-lingual intelligence for LLMs were discussed. The analysis indicates that existing LLMs are limited by imbalanced multi-lingual training data, and therefore suffer from ethical biases across languages and cultures, homogenization of language style, a lack of evaluation benchmarks for multi-lingual capabilities, and hallucinated outputs in multi-lingual scenarios. To achieve multi-lingually intelligent LLMs, future work can adopt strategies such as joint training of languages within the same language family, multi-lingual adapter techniques, cross-lingual transfer learning, prompt engineering, and reinforcement learning from artificial intelligence feedback.
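The abstract names multi-lingual adapter technology as one of the improvement directions. As a rough, hypothetical sketch of that idea (not code from the surveyed work), a per-language bottleneck adapter attached to a frozen pre-trained backbone might look like the following in PyTorch; the module names, sizes, and language codes are illustrative assumptions:

```python
# Minimal sketch (assumption, not from the paper) of the multi-lingual adapter idea:
# a small bottleneck module inserted into a frozen pre-trained Transformer layer,
# with one adapter trained per language.
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Down-project -> non-linearity -> up-project, with a residual connection."""
    def __init__(self, hidden_size: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.GELU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return hidden_states + self.up(self.act(self.down(hidden_states)))

# One lightweight adapter per language; the shared backbone stays frozen,
# so only a small number of parameters is trained for each new language.
adapters = nn.ModuleDict({
    "sw": BottleneckAdapter(),   # e.g. Swahili (low-resource)
    "en": BottleneckAdapter(),   # e.g. English (high-resource)
})

hidden = torch.randn(2, 16, 768)      # (batch, sequence, hidden) from a frozen layer
adapted = adapters["sw"](hidden)      # route through the target-language adapter
print(adapted.shape)                  # torch.Size([2, 16, 768])
```

Because only the adapter weights are updated, this style of fine-tuning is one plausible way to extend an LLM to a low-resource language without retraining the full model, which is the motivation the abstract gives for adapter-based approaches.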

Key words: large language model, multi-lingual intelligence, cross-lingual model, artificial general intelligence, transfer learning


CLC Number: