Journal of Computer Applications ›› 2026, Vol. 46 ›› Issue (6): 1811-1817.DOI: 10.11772/j.issn.1001-9081.2025060745
• Data science and technology • Previous Articles
Yusheng YI, Zhaohao HUANG, Zihao DENG, Leilei KONG(
), Haoliang QI
Received:2025-07-08
Revised:2025-09-18
Accepted:2025-09-22
Online:2025-10-16
Published:2026-06-10
Contact:
Leilei KONG
About author:YI Yusheng, born in 1989, Ph. D., lecturer. His research interests include natural language processing, IT application innovation.Supported by:通讯作者:
孔蕾蕾
作者简介:易宇声(1989—),男,湖南益阳人,讲师,博士,CCF会员,主要研究方向:自然语言处理、信息技术应用创新基金资助:CLC Number:
Yusheng YI, Zhaohao HUANG, Zihao DENG, Leilei KONG, Haoliang QI. CORER: collaborative multi-knowledge large language model prompt framework for IT application innovation database migration[J]. Journal of Computer Applications, 2026, 46(6): 1811-1817.
易宇声, 黄兆豪, 邓梓昊, 孔蕾蕾, 齐浩亮. 面向信创数据库迁移的多知识库协同大语言模型提示框架CORER[J]. 《计算机应用》唯一官方网站, 2026, 46(6): 1811-1817.
Add to citation manager EndNote|Ris|BibTeX
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2025060745
| 功能特性 | 语法数 | 示例 |
|---|---|---|
| DDL | 126 | CREATE、ALTER |
| DML | 15 | INSERT、UPDATE、DELETE |
| DQL | 5 | UNION、INTERSECT |
| TCL | 11 | BEGIN、COMMIT/END |
| DCL | 25 | REVOKE、SET ROLE |
| 性能与维护 | 28 | EXPLAIN、EXPLAIN PLAN |
| 安全管理 | 17 | ALTER AUDIT POLICY |
| 游标与预处理语句 | 9 | DECLARE、CURSOR |
| 高级特性 | 21 | CREATE PACKAGE与 DROP PACKAGE |
| 资源与扩展 | 14 | CREATE RESOURCE与 DROP RESOURCE |
| 企业版特性 | 19 | CREATE、ALTER、DROP |
Tab. 1 Statistics of openGauss syntax types
| 功能特性 | 语法数 | 示例 |
|---|---|---|
| DDL | 126 | CREATE、ALTER |
| DML | 15 | INSERT、UPDATE、DELETE |
| DQL | 5 | UNION、INTERSECT |
| TCL | 11 | BEGIN、COMMIT/END |
| DCL | 25 | REVOKE、SET ROLE |
| 性能与维护 | 28 | EXPLAIN、EXPLAIN PLAN |
| 安全管理 | 17 | ALTER AUDIT POLICY |
| 游标与预处理语句 | 9 | DECLARE、CURSOR |
| 高级特性 | 21 | CREATE PACKAGE与 DROP PACKAGE |
| 资源与扩展 | 14 | CREATE RESOURCE与 DROP RESOURCE |
| 企业版特性 | 19 | CREATE、ALTER、DROP |
| 二级分类 | 语法内容 |
|---|---|
| 功能描述 | BEGIN可以用于开始一个匿名块,也可以用于开始 一个事务…… |
| 注意事项 | 在gsql中使用BEGIN开启事务执行SQL语句时, 勿将BEGIN与需要执行的SQL语句写在一行…… |
| 语法格式 | 1)开启匿名块:[DECLARE [declare_statements]] BEGIN execution_statements END; 2)开启事务:…… |
| 参数说明 | 1)declare_statements:声明变量,包括变量名和变量类型,如“sales_cnt int”。 2)execution_statements:…… |
Tab. 2 BEGIN syntax rule knowledge units
| 二级分类 | 语法内容 |
|---|---|
| 功能描述 | BEGIN可以用于开始一个匿名块,也可以用于开始 一个事务…… |
| 注意事项 | 在gsql中使用BEGIN开启事务执行SQL语句时, 勿将BEGIN与需要执行的SQL语句写在一行…… |
| 语法格式 | 1)开启匿名块:[DECLARE [declare_statements]] BEGIN execution_statements END; 2)开启事务:…… |
| 参数说明 | 1)declare_statements:声明变量,包括变量名和变量类型,如“sales_cnt int”。 2)execution_statements:…… |
| 类型 | 组元 | 内容 |
|---|---|---|
| 语句模板 | Template | SELECT [DISTINCT] <column1>, <column2>, …, <columnN> FROM < [INNER JOIN < [LEFT JOIN < [RIGHT JOIN < [JOIN < WHERE <condition> … |
| 迁移样本 | SourceSQL | SELECT * FROM users WHERE name LIKE CONCAT('%', ?, '%') AND age BETWEEN? AND? ORDER BY created_at DESC LIMIT 10 |
| TargetSQL | SELECT * FROM users WHERE name LIKE '%' || ? || '%' AND age BETWEEN? AND? ORDER BY created_at DESC LIMIT 10 | |
| ErrorContext | MySQL使用CONCAT函数进行字符串拼接,而openGauss使用'||'作为字符串连接符,因此无法自动转换 |
Tab. 3 SELECT syntax migration sample knowledge units
| 类型 | 组元 | 内容 |
|---|---|---|
| 语句模板 | Template | SELECT [DISTINCT] <column1>, <column2>, …, <columnN> FROM < [INNER JOIN < [LEFT JOIN < [RIGHT JOIN < [JOIN < WHERE <condition> … |
| 迁移样本 | SourceSQL | SELECT * FROM users WHERE name LIKE CONCAT('%', ?, '%') AND age BETWEEN? AND? ORDER BY created_at DESC LIMIT 10 |
| TargetSQL | SELECT * FROM users WHERE name LIKE '%' || ? || '%' AND age BETWEEN? AND? ORDER BY created_at DESC LIMIT 10 | |
| ErrorContext | MySQL使用CONCAT函数进行字符串拼接,而openGauss使用'||'作为字符串连接符,因此无法自动转换 |
| 语句功能特性 | 语句示例 | SQL语句数 | 占比/% |
|---|---|---|---|
| 数据定义语言(DDL) | CREATE TABLE, ALTER INDEX | 2 101 | 20.60 |
| 数据操作语言(DML) | INSERT, UPDATE, DELETE | 3 826 | 37.40 |
| 数据查询语言(DQL) | SELECT(含JOIN/子查询) | 1 848 | 18.10 |
| 事务控制语言(TCL) | COMMIT, SAVEPOINT | 614 | 6.00 |
| 数据控制语言(DCL) | GRANT, REVOKE | 307 | 3.00 |
| 性能与维护 | ANALYZE TABLE, OPTIMIZE | 512 | 5.00 |
| 高级特性 | 存储过程/触发器/窗口函数 | 759 | 7.50 |
| 资源与扩展 | CREATE TABLESPACE | 250 | 2.40 |
Tab. 4 Experimental dataset
| 语句功能特性 | 语句示例 | SQL语句数 | 占比/% |
|---|---|---|---|
| 数据定义语言(DDL) | CREATE TABLE, ALTER INDEX | 2 101 | 20.60 |
| 数据操作语言(DML) | INSERT, UPDATE, DELETE | 3 826 | 37.40 |
| 数据查询语言(DQL) | SELECT(含JOIN/子查询) | 1 848 | 18.10 |
| 事务控制语言(TCL) | COMMIT, SAVEPOINT | 614 | 6.00 |
| 数据控制语言(DCL) | GRANT, REVOKE | 307 | 3.00 |
| 性能与维护 | ANALYZE TABLE, OPTIMIZE | 512 | 5.00 |
| 高级特性 | 存储过程/触发器/窗口函数 | 759 | 7.50 |
| 资源与扩展 | CREATE TABLESPACE | 250 | 2.40 |
| 方法 | 准确率 | 平均准确率 | |||||||
|---|---|---|---|---|---|---|---|---|---|
| DDL | DML | DQL | TCL | DCL | 性能与维护 | 高级特性 | 资源与扩展 | ||
| UGO | 0.947 1 | 0.935 3 | 0.915 2 | 0.961 4 | 0.970 5 | 0.921 6 | 0.813 8 | 0.905 7 | 0.921 3 |
| CORER | 0.950 2 | 0.932 1 | 0.935 4 | 0.955 7 | 0.965 3 | 0.945 2 | 0.870 9 | 0.920 4 | 0.934 4 |
Tab. 5 Comparison experimental results
| 方法 | 准确率 | 平均准确率 | |||||||
|---|---|---|---|---|---|---|---|---|---|
| DDL | DML | DQL | TCL | DCL | 性能与维护 | 高级特性 | 资源与扩展 | ||
| UGO | 0.947 1 | 0.935 3 | 0.915 2 | 0.961 4 | 0.970 5 | 0.921 6 | 0.813 8 | 0.905 7 | 0.921 3 |
| CORER | 0.950 2 | 0.932 1 | 0.935 4 | 0.955 7 | 0.965 3 | 0.945 2 | 0.870 9 | 0.920 4 | 0.934 4 |
| 方法 | 准确率 | 平均准确率 | |||||||
|---|---|---|---|---|---|---|---|---|---|
| DDL | DML | DQL | TCL | DCL | 性能/维护 | 高级特性 | 资源/扩展 | ||
| CORER | 0.950 2 | 0.932 1 | 0.935 4 | 0.955 7 | 0.965 3 | 0.945 2 | 0.870 9 | 0.920 4 | 0.934 4 |
| CORER-Rules⁻ | 0.913 7 | 0.896 5 | 0.882 3 | 0.921 4 | 0.940 2 | 0.874 5 | 0.723 1 | 0.856 2 | 0.762 9 |
| CORER-Examples⁻ | 0.931 5 | 0.918 2 | 0.901 7 | 0.932 5 | 0.951 1 | 0.897 3 | 0.755 6 | 0.889 5 | 0.781 9 |
| CORER-Base | 0.854 2 | 0.863 9 | 0.837 1 | 0.880 3 | 0.902 4 | 0.801 5 | 0.690 8 | 0.783 3 | 0.720 9 |
Tab. 6 Ablation experimental results
| 方法 | 准确率 | 平均准确率 | |||||||
|---|---|---|---|---|---|---|---|---|---|
| DDL | DML | DQL | TCL | DCL | 性能/维护 | 高级特性 | 资源/扩展 | ||
| CORER | 0.950 2 | 0.932 1 | 0.935 4 | 0.955 7 | 0.965 3 | 0.945 2 | 0.870 9 | 0.920 4 | 0.934 4 |
| CORER-Rules⁻ | 0.913 7 | 0.896 5 | 0.882 3 | 0.921 4 | 0.940 2 | 0.874 5 | 0.723 1 | 0.856 2 | 0.762 9 |
| CORER-Examples⁻ | 0.931 5 | 0.918 2 | 0.901 7 | 0.932 5 | 0.951 1 | 0.897 3 | 0.755 6 | 0.889 5 | 0.781 9 |
| CORER-Base | 0.854 2 | 0.863 9 | 0.837 1 | 0.880 3 | 0.902 4 | 0.801 5 | 0.690 8 | 0.783 3 | 0.720 9 |
| [1] | 民政部信息中心. 信创产业发展的重要意义[EB/OL]. [2025-03-11].. |
| Information Center of Ministry of Civil Affairs. The importance of the development of the information technology application innovation industry[EB/OL]. [2025-03-11].. | |
| [2] | 苏莹,杨文波,张景云. 国产数据库替换中的常见挑战与应用策略 [J]. 信息系统工程, 2025(1): 99-102. |
| SU Y, YANG W B, ZHANG J Y. Common challenges and application strategies in domestic database replacement[J]. China CIO News, 2025(1): 99-102. | |
| [3] | THALHEIM B, WANG Q. Data migration: a theoretical perspective[J]. Data and Knowledge Engineering, 2013, 87: 260-278. |
| [4] | LASZEWSKI T, NAUDURI P. Identifying the level of effort and cost[M]// Migrating to the cloud: oracle client/server modernization. Rockland, MA: Syngress, 2011: 21-43. |
| [5] | SALUNKE S V, OUDA A. A performance benchmark for the PostgreSQL and MySQL databases[J]. Future Internet, 2024, 16(10): No.382. |
| [6] | 阿里云. 数据传输服务DTS[EB/OL]. [2025-05-19].. |
| Aliyun. Data transmission service DTS[EB/OL]. [2025-05-19].. | |
| [7] | HAGOS D H, BATTLE R, RAWAT D B. Recent advances in generative AI and large language models: current status, challenges, and perspectives[J]. IEEE Transactions on Artificial Intelligence, 2024, 5(12): 5873-5893. |
| [8] | CHEN M, TWOREK J, JUN H, et al. Evaluating large language models trained on code[EB/OL]. [2025-05-19].. |
| [9] | ZHU X, LI Q, CUI L, et al. Large language model enhanced text-to-SQL generation: a survey[EB/OL]. [2025-03-12].. |
| [10] | ZHOU X, SUN Z, LI G. DB-GPT: large language model meets database[J]. Data Science and Engineering, 2024, 9(1): 102-111. |
| [11] | GOYAL M K, CHATURVEDI R, SUNDARAMOORTHY P, et al. Leveraging generative AI for database migration: a comprehensive approach for heterogeneous migrations[EB/OL]. [2025-05-22].. |
| [12] | NGOM A L, KRASKA T. Mallet: SQL dialect translation with LLM rule generation[C]// Proceedings of the 7th International Workshop on Exploiting Artificial Intelligence Techniques for Data Management. New York: ACM, 2024: No.3. |
| [13] | MySQL. MySQL homepage[EB/OL]. [2025-05-19].. |
| [14] | 李国良,王江,陈国. openGauss:企业级开源数据库系统[J]. 计算机科学技术学报, 2024, 39(5): 1007-1028. |
| LI G L, WANG J, CHEN G. openGauss: enterprise-level open source database system[J]. Journal of Computer Science and Technology, 2024, 39(5): 1007-1028. | |
| [15] | openGauss. SQL语法[EB/OL]. [2025-05-19]. |
| 语法.html. (openGauss. SQL syntax[EB/OL]. [2025-05-19]. 语法.html.) | |
| [16] | SQLGlot. SQLGlot homepage[EB/OL]. [2025-05-28].. |
| [17] | jOOQ. jOOQ homepage[EB/OL]. [2025-05-28].. |
| [18] | SQLines. SQLines homepage[EB/OL]. [2025-05-28].. |
| [19] | ELAMPARITHI M, ANURATHA V. A review on database migration strategies, techniques and tools[J]. World Journal of Computer Application and Technology, 2015, 3(3): 41-48. |
| [20] | 华为云. 数据库和应用迁移UGO[EB/OL]. [2025-05-19].. |
| Cloud Huawei. Database and application migration UGO [EB/OL]. [2025-05-19].. | |
| [21] | Cloud Google. Gemini helps migrate Oracle to PostgreSQL on Google Cloud[EB/OL]. [2025-03-22].. |
| [22] | ZHOU W, GAO Y, ZHOU X, et al. Cracking SQL barriers: an LLM-based dialect translation system[J]. Proceedings of the ACM on Management of Data, 2025, 3(3): No.141. |
| [23] | BROWN T B, MANN B, RYDER N, et al. Language models are few-shot learners[C]// Proceedings of the 34th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2020: 1877-1901. |
| [24] | KATHARE N, REDDY O V, PRABHU V. A comprehensive study of Elasticsearch[J]. International Journal of Science and Research, 2021, 10(6): 716-720. |
| [25] | REIMERS N, GUREVYCH I. Sentence-BERT: sentence embeddings using Siamese BERT-networks[C]// Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. Stroudsburg: ACL, 2019: 3982-3992. |
| [26] | ROBERTSON S, ZARAGOZA H. The probabilistic relevance framework: BM25 and beyond[J]. Foundations and Trends® in Information Retrieval, 2009, 3(4): 333-389. |
| [27] | TOUVRON H, LAVRIL T, IZACARD G, et al. LLaMA: open and efficient foundation language models[EB/OL]. [2025-01-02].. |
| [1] | Longyu XIONG, Shengdong DU, Haochen SHI, Jie HU, Yan YANG, Tianrui LI. Government affairs hotline question answering system based on knowledge-enhanced large language model architecture [J]. Journal of Computer Applications, 2026, 46(6): 1721-1727. |
| [2] | Jintao WANG, Zhilin GAO, Qixiang MENG, Fanliang BU. Legal case retrieval method via case information reformulation using large language model [J]. Journal of Computer Applications, 2026, 46(6): 1785-1792. |
| [3] | Taixin CAI, Fengfeng WEI. Large language model-enhanced ant colony optimization for multi-solution traveling salesman problems [J]. Journal of Computer Applications, 2026, 46(6): 1712-1720. |
| [4] | Jiali ZHENG, Gang ZHOU, Jing CHEN, Shunhang LI. Adaptive multi-feature fusion detection method for AI-generated text [J]. Journal of Computer Applications, 2026, 46(5): 1433-1440. |
| [5] | Xing SHENG, Sunxian WENG, Kuosong CHEN, Zhongping WANG, Ruifeng REN, Yong LIU. Deep learning-based patent value evaluation for power grid enterprises [J]. Journal of Computer Applications, 2026, 46(5): 1468-1474. |
| [6] | Qianfei WANG, Yang LI, Deyu LI, Suge WANG. Dual-channel feature fusion representation method for short-text clustering based on large language model [J]. Journal of Computer Applications, 2026, 46(5): 1441-1449. |
| [7] | Xiaoyu WANG, Xin LI, Di XUE, Zhangtao JIANG, Wei WANG, Yanjun XIAO. Vulnerability classification framework for video surveillance network security based on large language models [J]. Journal of Computer Applications, 2026, 46(4): 1158-1170. |
| [8] | Kaizhou SHI, Xuan HE, Guoyi HOU, Gen LI, Shuanggao LI, Xiang HUANG. Airborne product metrological traceability knowledge graph construction method based on large language models [J]. Journal of Computer Applications, 2026, 46(4): 1086-1095. |
| [9] | Haoyang ZHANG, Liping ZHANG, Sheng YAN, Na LI, Xuefei ZHANG. Review of large language model methods for knowledge graph completion [J]. Journal of Computer Applications, 2026, 46(3): 683-695. |
| [10] | Bin SHEN, Xiaoning CHEN, Hua CHENG, Yiquan FANG, Huifeng WANG. Intelligent undergraduate teaching evaluation system based on large language models [J]. Journal of Computer Applications, 2026, 46(3): 993-1003. |
| [11] | Enkang XI, Jing FAN, Yadong JIN, Hua DONG, Hao YU, Yihang SUN. Review of threats faced by federated learning in privacy and security field [J]. Journal of Computer Applications, 2026, 46(3): 798-808. |
| [12] | Yiming HUANG, Xihua ZOU, Guo DENG, Di ZHENG. Pre-answering and retrieval filtering: dual-stage optimization method for RAG-based question-answering systems [J]. Journal of Computer Applications, 2026, 46(3): 696-707. |
| [13] | Dingjia WU, Zhe CUI. MG-SQL: SQL generation framework with enhanced schema linking and multi-generator collaboration [J]. Journal of Computer Applications, 2026, 46(3): 723-731. |
| [14] | Rilong WANG, Zhenping LI, Xiaosong LI, Qiang GAO, Ya HE, Yong ZHONG, Yingxiao ZHAO. Multi-Agent collaborative knowledge reasoning framework [J]. Journal of Computer Applications, 2026, 46(3): 708-714. |
| [15] | Fei GAO, Dong CHEN, Dixing BIAN, Wenqiang FAN, Qidong LIU, Pei LYU, Chaoyang ZHANG, Mingliang XU. Multistage coupled decision-making framework for researcher redeployment after discipline revocation [J]. Journal of Computer Applications, 2026, 46(2): 416-426. |
| Viewed | ||||||
|
Full text |
|
|||||
|
Abstract |
|
|||||