Journal of Computer Applications, 2024, Vol. 44, Issue (1): 101-112. DOI: 10.11772/j.issn.1001-9081.2023010080
• Artificial intelligence •
Chenghao FENG, Zhenping XIE, Bowen DING
Received: 2023-02-06
Revised: 2023-03-28
Accepted: 2023-03-29
Online: 2023-06-06
Published: 2024-01-10
Contact: Zhenping XIE
About author: FENG Chenghao, born in 1997 in Jiaozuo, Henan, M. S. candidate. His research interests include intelligent system software.
Chenghao FENG, Zhenping XIE, Bowen DING. Selective generation method of test cases for Chinese text error correction software[J]. Journal of Computer Applications, 2024, 44(1): 101-112.
Original Chinese title: 中文文本纠错软件测试用例的选择生成方法 (authors: 冯程皓, 谢振平, 丁博文).
URL: http://www.joca.cn/EN/10.11772/j.issn.1001-9081.2023010080
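Function | Parameters | Purpose |
---|---|---|
Init | Set of texts; set of segmented-word counts per text; set of error counts; set of error-type frequencies | Initialization |
Generate | Size of the original text test set | Initialize the harmony |
Generate_alter | Size of the original text test set | Iterate a new harmony |
CalculateFitness_0 | NULL | Compute fitness |
CalculateFitness_1 | NULL | Compute fitness |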
Tab. 1 Initialization function information
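The function names in Tab. 1 suggest a harmony-search loop: initialize the case attributes, build an initial set of harmonies, improvise new ones, and score each by fitness. The following is a minimal sketch of that loop, not the authors' implementation. The harmony encoding (a subset of case indices), the `hmcr` parameter, the memory size, and the placeholder fitness (errors per segmented word) are all assumptions made for illustration; the page does not define `CalculateFitness_0` or `CalculateFitness_1`.

```python
import random


def init(texts, word_counts, error_counts, error_type_freqs):
    """Bundle per-case attributes (mirrors the four inputs listed for Init)."""
    return [
        {"text": t, "words": w, "errors": e, "type_freq": f}
        for t, w, e, f in zip(texts, word_counts, error_counts, error_type_freqs)
    ]


def generate(pool_size, subset_size, rng):
    """Create one initial harmony: a random subset of case indices (assumed encoding)."""
    return sorted(rng.sample(range(pool_size), subset_size))


def generate_alter(memory, pool_size, rng, hmcr=0.9):
    """Improvise a new harmony, mostly from memory, sometimes at random."""
    subset_size = len(memory[0])
    chosen = set()
    while len(chosen) < subset_size:
        if rng.random() < hmcr:
            chosen.add(rng.choice(rng.choice(memory)))  # memory consideration
        else:
            chosen.add(rng.randrange(pool_size))  # random exploration
    return sorted(chosen)


def calculate_fitness(harmony, cases):
    """Placeholder fitness (assumption): errors per segmented word in the subset."""
    errors = sum(cases[i]["errors"] for i in harmony)
    words = sum(cases[i]["words"] for i in harmony)
    return errors / words


def harmony_search(cases, subset_size, memory_size=5, iterations=200, seed=0):
    """Keep a small harmony memory and replace its worst member when beaten."""
    rng = random.Random(seed)
    memory = [generate(len(cases), subset_size, rng) for _ in range(memory_size)]
    for _ in range(iterations):
        candidate = generate_alter(memory, len(cases), rng)
        worst = min(memory, key=lambda h: calculate_fitness(h, cases))
        if calculate_fitness(candidate, cases) > calculate_fitness(worst, cases):
            memory[memory.index(worst)] = candidate
    return max(memory, key=lambda h: calculate_fitness(h, cases))
```

Calling `harmony_search(cases, subset_size=k)` returns the indices of the selected cases; swapping in a different `calculate_fitness` changes what the search favors without touching the loop.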
Experiment group No. | Case-set size/10³ | Number of case sets |
---|---|---|
1 | 10 | 10 |
2 | 100 | 10 |
3 | 1 000 | 10 |
Tab. 2 Experiment parameter settings for time cost and stability of AGM
Correction software | Case-set size/10³ | Number of case sets |
---|---|---|
iFLYTEK | 10 | 10 |
iFLYTEK | 100 | 10 |
iFLYTEK and Baidu | 10 | 10 |
Tab. 3 Experiment parameter settings for effectiveness and versatility of cases generated by AGM
Experiment No. | Error-count density: minimum error interval | Error-count density: expected value | Error-type density: minimum error interval | Error-type density: expected value |
---|---|---|---|---|
1 | 3 | 0.20 | 2 | 0.20 |
2 | 4 | 0.20 | 3 | 0.20 |
3 | 5 | 0.20 | 4 | 0.20 |
4 | 2 | 0.20 | 5 | 0.20 |
5 | 6 | 0.20 | 6 | 0.20 |
6 | 2 | 0.20 | 3 | 0.20 |
7 | 2 | 0.30 | 3 | 0.15 |
8 | 2 | 0.25 | 3 | 0.10 |
9 | 2 | 0.35 | 3 | 0.25 |
10 | 2 | 0.40 | 3 | 0.30 |
Tab. 4 SM experiment parameters
Experiment No. | Evaluation criterion | Case-set size/10³ | Number of case sets |
---|---|---|---|
1 | Error-count density | 100 | 10 |
2 | Error-type density | 100 | 10 |
Tab. 5 Experiment parameters of prioritization module
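Tab. 5 names two criteria for the prioritization module, error-count density and error-type density. As a rough sketch of what ordering by either criterion could look like, the snippet below sorts cases in descending density. The density definitions (errors per segmented word, distinct error types per segmented word), the `Case` record, and the `prioritize` helper are assumptions for illustration; the page does not give the authors' formulas.

```python
from dataclasses import dataclass


@dataclass
class Case:
    """Hypothetical test-case record; field names are illustrative."""
    text: str
    words: int        # number of segmented words
    errors: int       # number of injected errors
    error_types: int  # number of distinct error types


def error_count_density(case: Case) -> float:
    """Assumed definition: errors per segmented word."""
    return case.errors / case.words


def error_type_density(case: Case) -> float:
    """Assumed definition: distinct error types per segmented word."""
    return case.error_types / case.words


def prioritize(cases, criterion):
    """Order cases from highest to lowest density; ties keep input order."""
    return sorted(cases, key=criterion, reverse=True)
```

Passing `error_count_density` or `error_type_density` as `criterion` switches between the two experiment settings without changing the sorting code.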
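Generation method | Applicable case-set scenario | Considers case-set optimization | Reusable |
---|---|---|---|
SGMT-CCS | Case sets of any size | Yes | Yes |
Manual generation | Small case sets | No | No |
Semi-automatic generation | Small case sets (larger case sets are possible in theory) | No | No |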
Tab. 6 Application attribute comparison of Chinese text generation methods
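Case-set size/10³ | Chinese text generation method | Requirements analysis + character/word table generation time/s | Case generation time using the character/word table/s |
---|---|---|---|
10¹ | SGMT-CCS | 0 | ≈15 |
10¹ | Manual generation | ≥600 | ≥10×10³ |
10¹ | Semi-automatic generation | ≥600 | ≈15 |
10² | SGMT-CCS | 0 | ≈100 |
10² | Manual generation | ≥600 | ≥100×10³ |
10² | Semi-automatic generation | ≥600 | ≈100 |
10³ | SGMT-CCS | 0 | ≈1 000 |
10³ | Manual generation | ≥600 | ≥1 000×10³ |
10³ | Semi-automatic generation | ≥600 | ≈1 000 |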
Tab. 7 Time cost comparison of Chinese text generation methods
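To make the comparison in Tab. 7 concrete, the two time columns can be added per method. The sketch below does that arithmetic, reading the garbled sizes as 10¹, 10² and 10³ (in units of 10³ cases) and treating each "≈" or "≥" entry as its stated number, so the totals are rough figures or lower bounds rather than measurements. The dictionary layout and the `total_time` helper are illustrative names only.

```python
# (analysis + word-table time, case-generation time) in seconds, from Tab. 7.
# Keys are case-set sizes in units of 10^3 cases; "≈"/"≥" qualifiers are dropped.
TAB7 = {
    10: {
        "SGMT-CCS": (0, 15),
        "manual": (600, 10_000),
        "semi-automatic": (600, 15),
    },
    100: {
        "SGMT-CCS": (0, 100),
        "manual": (600, 100_000),
        "semi-automatic": (600, 100),
    },
    1000: {
        "SGMT-CCS": (0, 1_000),
        "manual": (600, 1_000_000),
        "semi-automatic": (600, 1_000),
    },
}


def total_time(size: int, method: str) -> int:
    """Sum both time columns for one row of Tab. 7 (seconds)."""
    analysis, generation = TAB7[size][method]
    return analysis + generation


for size, methods in TAB7.items():
    for method in methods:
        print(f"{size}x10^3 cases, {method}: about {total_time(size, method)} s")
```

On these figures, the semi-automatic method differs from SGMT-CCS only by the fixed analysis cost of at least 600 s, while the manual method's generation time grows with the case-set size.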
Team | IP | IF |
---|---|---|
YNU-HPCC | 0.408 6 | 0.416 7 |
NTOUA | 0.388 9 | 0.439 8 |
CVTE | 0.606 0 | 0.297 8 |
BNU | 0.552 7 | 0.211 8 |
AL_I_NLP | 0.479 1 | 0.516 4 |
Tab. 8 Error correction accuracies of teams participating in Chinese Grammatical Error Diagnosis-2017