Intelligent undergraduate teaching evaluation system based on large language models

doi:10.11772/j.issn.1001-9081.2025030334

Journal of Computer Applications (sole official website)


SHEN Bin^1,2, CHEN Xiaoning^1, CHENG Hua^2,3, FANG Yiquan^1, WANG Huifeng^2

1. Informatization and Data Management Center, East China University of Science and Technology 2. School of Information Science and Engineering, East China University of Science and Technology 3. Undergraduate Academic Affairs Office, East China University of Science and Technology

Received: 2025-03-31 Revised: 2025-05-07 Online: 2025-05-13 Published: 2025-05-13
Corresponding author: CHEN Xiaoning
About the authors: SHEN Bin, born in 1989 in Shaoxing, Zhejiang, M. S., engineer. His research interests include deep learning, and artificial intelligence and its applications. CHEN Xiaoning, born in 1983 in Fuzhou, Fujian, M. S., engineer. His research interests include large language models and information security. CHENG Hua, born in 1975 in Huangshan, Anhui, Ph. D., professor. His research interests include natural language processing, and artificial intelligence and its applications. FANG Yiquan, born in 1975 in Yangzhou, Jiangsu, M. S., professor-level senior engineer. Her research interests include information security. WANG Huifeng, born in 1969 in Harbin, Heilongjiang, Ph. D., professor. Her research interests include intelligent perception.
Supported by:
Ministry of Education Project: Research on the Digital and Intelligent Transformation of University Evaluation (P250603)

Abstract: Undergraduate teaching audit and evaluation is a key instrument of quality assurance in higher education, and how rigorously it is carried out directly affects the quality of talent cultivation at universities. Traditional manual review, however, is inefficient and subjective when confronted with massive heterogeneous data, and struggles to deliver the accuracy and standardization that undergraduate teaching evaluation requires. To address this, SmartEval (智评宝), an intelligent undergraduate teaching evaluation system based on large language models and a multi-agent architecture, is proposed. The system parses input content through a semantic understanding module, decomposes and schedules tasks with a planner, and combines a retrieval-augmented generation module with three types of agents (question answering, summarization, and diagnosis) to automate the full workflow from data collection through indicator analysis to decision support. Experiments on the "1+3+3" series reports from the 2023 undergraduate teaching evaluation of selected universities show that SmartEval significantly outperforms mainstream large language models such as GLM-4 and Qwen2.5 in question-answering accuracy, ROUGE-L score for summarization, and F1 score for diagnosis. Consistency tests against expert panels further support the reliability of its results.

Key words: undergraduate teaching evaluation, large language model, agent, digital intelligence, educational informatization
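The control flow described in the abstract (semantic understanding, then a planner, then retrieval-augmented generation feeding question-answering, summarization, and diagnosis agents) can be illustrated with a minimal Python sketch. This is not SmartEval's implementation, which the abstract does not publish: every name here (`Task`, `understand`, `plan`, `retrieve`, `PROMPTS`, `run`) is hypothetical, the cue-word intent detection and word-overlap retrieval are toy stand-ins for the real modules, and the language model is passed in as a plain callable.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical sketch only: none of these names come from the paper.


@dataclass
class Task:
    kind: str   # "qa", "summary", or "diagnosis"
    query: str


# One prompt template per agent type named in the abstract (wording is assumed).
PROMPTS = {
    "qa": "Answer using the context.\nContext:\n{context}\nQuestion: {query}",
    "summary": "Summarize the context.\nContext:\n{context}\nRequest: {query}",
    "diagnosis": "List weaknesses shown by the context.\nContext:\n{context}\nRequest: {query}",
}


def understand(text: str) -> list[str]:
    """Toy semantic understanding: map cue words to agent kinds."""
    cues = {"summarize": "summary", "diagnose": "diagnosis"}
    lowered = text.lower()
    kinds = [kind for cue, kind in cues.items() if cue in lowered]
    return kinds or ["qa"]  # default to question answering


def plan(text: str) -> list[Task]:
    """Toy planner: one task per detected intent, in detection order."""
    return [Task(kind, text) for kind in understand(text)]


def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Toy retrieval: rank passages by word overlap with the query."""
    words = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda passage: len(words & set(passage.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def run(text: str, corpus: list[str], llm: Callable[[str], str]) -> dict[str, str]:
    """Dispatch each planned task to its agent prompt, grounded in retrieved context."""
    results = {}
    for task in plan(text):
        context = "\n".join(retrieve(task.query, corpus))
        prompt = PROMPTS[task.kind].format(context=context, query=task.query)
        results[task.kind] = llm(prompt)
    return results


if __name__ == "__main__":
    reports = [
        "student teacher ratio is 18 to 1",
        "library holdings grew last year",
        "teacher training hours fell",
    ]
    # Stub model: a real system would call an LLM here.
    echo_llm = lambda prompt: f"<answer based on {len(prompt)} prompt characters>"
    print(run("Summarize and diagnose teacher workload", reports, echo_llm))
```

Passing the model in as a callable keeps the sketch runnable without any LLM dependency; swapping the stub for a real client, and the word-overlap ranking for an embedding index, would not change the dispatch structure.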

CLC number: TP391.1

SHEN Bin, CHEN Xiaoning, CHENG Hua, FANG Yiquan, WANG Huifeng. Intelligent undergraduate teaching evaluation system based on large language models[J]. Journal of Computer Applications, DOI: 10.11772/j.issn.1001-9081.2025030334.

