Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (1): 101-112. DOI: 10.11772/j.issn.1001-9081.2023010080

• Artificial Intelligence •

中文文本纠错软件测试用例的选择生成方法

冯程皓1, 谢振平1,2, 丁博文1

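  1. College of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu 214000
    2. Jiangsu Key Laboratory of Media Design and Software Technology (Jiangnan University), Wuxi, Jiangsu 214000
  • Received: 2023-02-06  Revised: 2023-03-28  Accepted: 2023-03-29  Online: 2023-06-06  Published: 2024-01-10
  • Corresponding author: XIE Zhenping
  • About the authors: FENG Chenghao, born in 1997, male, a native of Jiaozuo, Henan, M. S. candidate. His research interests include intelligent system software.
    DING Bowen, born in 1996, male, a native of Shangqiu, Henan, M. S. candidate. His research interests include evolutionary algorithms.
    First contact: XIE Zhenping, born in 1979, male, a native of Changzhou, Jiangsu, Ph. D., professor, CCF member. His research interests include knowledge computing and cognitive learning.
  • Supported by:
    National Natural Science Foundation of China (61872166); Jiangsu Provincial “Six Talented Peaks” Project (XYDXX-161)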

Selective generation method of test cases for Chinese text error correction software

Chenghao FENG1, Zhenping XIE1,2, Bowen DING1

  1. College of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu 214000, China
    2. Jiangsu Key Laboratory of Media Design and Software Technology (Jiangnan University), Wuxi, Jiangsu 214000, China
  • Received: 2023-02-06  Revised: 2023-03-28  Accepted: 2023-03-29  Online: 2023-06-06  Published: 2024-01-10
  • Contact: Zhenping XIE
  • About author: FENG Chenghao, born in 1997, M. S. candidate. His research interests include intelligent system software.
    DING Bowen, born in 1996, M. S. candidate. His research interests include evolutionary algorithms.
  • Supported by:
    National Natural Science Foundation of China (61872166); Jiangsu Provincial “Six Talented Peaks” Project (XYDXX-161)

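Abstract:

No effective method currently exists for generating test cases for Chinese text error correction software. To support measuring the correction performance of such software and to indicate directions for optimizing it, a multi-user, engineering-oriented Selective Generation Method of Test cases for Chinese text error Correction Software (SGMT-CCS) was designed. Two criteria for evaluating test cases were defined, from which users can choose: error quantity density and error type density. Three modules were designed: an automatic test case generation module, a test case selection module, and a test case priority sorting module. In SGMT-CCS, users can: 1) customize the minimum error interval and the test case set size during automatic test case generation; 2) customize the minimum error interval and the expected value during test case selection; 3) choose different evaluation criteria during test case selection and priority sorting, to suit the requirements of different datasets. Experimental results show that SGMT-CCS obtains effective test cases in a relatively short time. The selection module experiments met the user-defined goals in every simulated requirement scenario, and the priority sorting module experiments verified that, compared with the order before sorting, test efficiency improved in different time periods under different evaluation criteria.

Keywords: test case generation, Chinese text error correction, selective generation, regression testing, natural language processing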

Abstract:

To address the lack of an effective method for generating test cases for Chinese text error correction software, and to support measuring and optimizing the correction performance of such software, a multi-user, engineering-oriented method called Selective Generation Method of Test cases for Chinese text error Correction Software (SGMT-CCS) was designed. Two user-selectable criteria for evaluating test cases were defined: error quantity density and error type density. SGMT-CCS consists of three modules: an automatic test case generation module, a test case selection module, and a test case priority sorting module. Users can: 1) customize the minimum error interval and the size of the test case set during automatic test case generation; 2) customize the minimum error interval and the expected value during test case selection; 3) choose among the evaluation criteria during test case selection and priority sorting, to meet the requirements of different datasets. Experimental results show that SGMT-CCS can generate effective test cases in a short time. In the selection module experiments, the user-defined goals were met under all simulated requirements. In the priority sorting module experiments, test efficiency improved over the unsorted order in different time periods under each evaluation criterion.

Key words: test case generation, Chinese text error correction, selective generation, regression test, Natural Language Processing (NLP)

CLC Number: