Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (3): 725-731.DOI: 10.11772/j.issn.1001-9081.2024111598

• Frontier research and typical applications of large models •

Efficient fine-tuning method of large language models for test case generation

Peng CAO1, Guangqi WEN1, Jinzhu YANG1, Gang CHEN2, Xinyi LIU2, Xuechun JI3

  1. School of Computer Science and Engineering, Northeastern University, Shenyang, Liaoning 110169, China
    2. State Grid Information and Telecommunication Company Limited, Beijing 102200, China
    3. State Grid Electric Power Research Institute Company Limited, Nanjing, Jiangsu 211106, China
  • Received: 2024-11-11 Revised: 2025-01-16 Accepted: 2025-01-17 Online: 2025-01-21 Published: 2025-03-10
  • Contact: Peng CAO
  • About author:WEN Guangqi, born in 1998, Ph. D. candidate. His research interests include machine learning and smart healthcare.
    YANG Jinzhu, born in 1979, Ph. D., professor, Ph. D. supervisor, CCF senior member. His research interests include artificial intelligence, image processing and analysis, and medical image reconstruction and optimization.
    CHEN Gang, born in 1985, engineer. His research interests include software architecture and software project management.
    LIU Xinyi, born in 1982, M. S., senior engineer. Her research interests include machine learning.
    JI Xuechun, born in 1977, M. S., professor-level senior engineer. His research interests include artificial intelligence and power system automation.
  • Supported by:
    State Grid Corporation Technology Project(5108-202340436A-3-2-ZN)


Abstract:

Data-driven automated unit test case generation suffers from low coverage and poor readability, making it difficult to meet the growing demand for testing. Recently, Large Language Models (LLMs) have shown great potential in code generation tasks. However, because code data differ in functional style and coding style, LLMs face two challenges: catastrophic forgetting and resource constraints. To address these problems, the idea of transferring functional style and coding style simultaneously through fine-tuning was proposed, and an efficient fine-tuning training method was developed for LLMs to generate unit test cases. Firstly, widely used instruction datasets were adopted to align the LLM with instructions, the instruction sets were partitioned by task type, and the weight increments carrying task-specific features were extracted and stored. Secondly, an adaptive style extraction module was designed to cope with diverse coding styles, incorporating noise-resistant learning and coding style backtracking learning. Finally, the functional style and coding style increments were jointly trained on the target domain, realizing efficient adaptation and fine-tuning on target domains with limited resources. Experimental results of test case generation on the SF110 Corpus of Classes dataset show that the proposed method outperforms all comparison methods. Compared with the mainstream code generation LLMs Codex, Code Llama and DeepSeek-Coder, the proposed method increases the compilation rate by 0.8%, 43.5% and 33.8%, the branch coverage by 3.1%, 1.0% and 17.2%, and the line coverage by 4.1%, 6.5% and 15.5%, respectively, verifying the superiority of the proposed method in code generation tasks.
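To make the weight-increment idea concrete, below is a minimal illustrative sketch in Python/PyTorch of extracting a task-specific weight increment (ΔW = W_tuned − W_base), storing one increment per task split, and jointly applying a functional-style and a coding-style increment to a base model with scaling coefficients. The helper names (extract_increment, apply_increments) and the scaling scheme are assumptions for illustration only; the paper's adaptive style extraction module, noise-resistant learning, and coding style backtracking learning are not reproduced here.

    import torch
    import torch.nn as nn


    def extract_increment(base: nn.Module, tuned: nn.Module) -> dict:
        """Store the per-parameter weight increment Delta W = W_tuned - W_base."""
        base_sd, tuned_sd = base.state_dict(), tuned.state_dict()
        return {k: (tuned_sd[k] - base_sd[k]).detach() for k in base_sd}


    def apply_increments(base: nn.Module, deltas: list, scales: list) -> nn.Module:
        """Merge stored increments into the base weights:
        W = W_base + sum_i scale_i * Delta W_i."""
        merged = {k: v.detach().clone() for k, v in base.state_dict().items()}
        for delta, scale in zip(deltas, scales):
            for k, d in delta.items():
                merged[k] += scale * d
        base.load_state_dict(merged)
        return base


    if __name__ == "__main__":
        torch.manual_seed(0)
        # Toy stand-ins for an LLM's weight matrices.
        base = nn.Linear(8, 8)
        func_tuned = nn.Linear(8, 8)   # imagine: fine-tuned on a functional-style task split
        style_tuned = nn.Linear(8, 8)  # imagine: fine-tuned on a coding-style corpus

        delta_func = extract_increment(base, func_tuned)    # task-specific increment
        delta_style = extract_increment(base, style_tuned)  # coding-style increment

        # Joint adaptation on the target domain: combine both increments.
        target_model = apply_increments(base, [delta_func, delta_style], [1.0, 0.5])

Storing full-parameter deltas, as above, is the simplest variant; in a practical setting, low-rank (LoRA-style) increments would keep the storage cost per task split affordable.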

Key words: unit test, code generation, Large Language Model (LLM), weight incremental learning, fine-tuning learning

