Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (3): 765-772. DOI: 10.11772/j.issn.1001-9081.2024101550
Design and practice of intelligent tutoring algorithm based on personalized student capability perception
Yanmin DONG1, Jiajia LIN1, Zheng ZHANG1, Cheng CHENG1, Jinze WU2, Shijin WANG2, Zhenya HUANG1,3, Qi LIU1,3, Enhong CHEN1
Received: 2024-11-01
Revised: 2024-12-25
Accepted: 2024-12-26
Online: 2025-02-07
Published: 2025-03-10
Contact: Zhenya HUANG
About author: DONG Yanmin, born in 2000, M. S. candidate. His research interests include code retrieval, natural language processing, and large language models.
Abstract: With the rapid development of Large Language Models (LLMs), LLM-based dialogue assistants are becoming a new way for students to learn. Through question-and-answer interaction, a dialogue assistant generates answers that help students solve problems and improve their learning efficiency. However, existing dialogue assistants ignore students' individual needs and cannot provide personalized answers that "teach students according to their aptitude". Therefore, a personalized dialogue assistant framework based on student capability perception is proposed. The framework consists of two main modules: a student capability perception module and a personalized answer generation module. The perception module mines a student's knowledge mastery from the student's answer records, and the generation module produces personalized answers according to the student's capability. On top of this framework, three implementation paradigms are designed to examine the framework's practical effect: instruction-based, small-model-driven, and agent-based. The instruction-based dialogue assistant uses the LLM's own reasoning ability to infer knowledge mastery from the student's answer records and thereby help generate personalized answers; the small-model-driven assistant uses a Deep Knowledge Tracing (DKT) model to produce the student's knowledge mastery; the agent-based assistant uses an LLM agent to integrate tools such as student capability perception, personalization detection, and answer revision to assist answer generation. Comparative experiments on ChatGLM (Chat General Language Model) and GPT4o_mini show that LLMs under all three paradigms can provide students with personalized answers, and the agent-based paradigm achieves the highest accuracy, indicating that this paradigm perceives student capability better and generates more personalized answers.
Yanmin DONG, Jiajia LIN, Zheng ZHANG, Cheng CHENG, Jinze WU, Shijin WANG, Zhenya HUANG, Qi LIU, Enhong CHEN. Design and practice of intelligent tutoring algorithm based on personalized student capability perception[J]. Journal of Computer Applications, 2025, 45(3): 765-772.
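To make the framework concrete, the sketch below illustrates one plausible implementation of the three paradigms described in the abstract. It is written from the abstract alone rather than from the authors' released code: all names (`AnswerRecord`, `perceive_mastery`, `DKT`, `build_prompt`, `agent_answer`), the 0.6 mastery threshold, and the `llm` callable are illustrative assumptions.

```python
# Minimal sketch of the proposed framework (assumptions throughout; this is
# not the authors' implementation).
from dataclasses import dataclass

import torch
import torch.nn as nn


@dataclass
class AnswerRecord:
    exercise_id: str
    concepts: list[str]  # knowledge concepts tested by the exercise
    correct: bool        # whether the student answered correctly


def perceive_mastery(records: list[AnswerRecord]) -> dict[str, float]:
    """Capability perception: per-concept correct-rate over answer logs.
    This frequency heuristic stands in for what the instruction-based
    paradigm asks the LLM itself to infer from raw records."""
    stats: dict[str, list[int]] = {}
    for r in records:
        for c in r.concepts:
            stats.setdefault(c, []).append(int(r.correct))
    return {c: sum(v) / len(v) for c, v in stats.items()}


class DKT(nn.Module):
    """Small-model-driven paradigm: Deep Knowledge Tracing. An LSTM reads
    one-hot (exercise, correctness) pairs and predicts per-concept mastery."""

    def __init__(self, n_concepts: int, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(2 * n_concepts, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_concepts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, 2 * n_concepts) -> mastery in [0, 1] per step
        h, _ = self.lstm(x)
        return torch.sigmoid(self.head(h))


def build_prompt(question: str, mastery: dict[str, float]) -> str:
    """Personalized answer generation: condition the LLM on mastery."""
    profile = ", ".join(f"{c}: {'high' if m >= 0.6 else 'low'}"
                        for c, m in mastery.items())
    return (f"The student's knowledge mastery is [{profile}]. "
            f"Answer at a matching level of detail.\nQuestion: {question}")


def agent_answer(question: str, records: list[AnswerRecord], llm) -> str:
    """Agent-based paradigm: chain capability perception, personalization
    detection, and answer revision as tools. `llm` is any text-in/text-out
    callable (e.g., a ChatGLM or GPT4o_mini API wrapper)."""
    mastery = perceive_mastery(records)            # tool: capability perception
    answer = llm(build_prompt(question, mastery))  # draft a personalized answer
    check = llm(f"Does this answer match the mastery profile {mastery}? "
                f"Reply yes or no.\n{answer}")     # tool: personalization detection
    if not check.strip().lower().startswith("yes"):
        answer = llm(f"Revise the answer to fit mastery {mastery}:\n{answer}")  # tool: revision
    return answer
```

Under this reading, the three paradigms differ mainly in where mastery comes from (the LLM's own reasoning over raw records, a trained DKT model, or an agent tool call) and in whether the draft answer is checked and revised before being returned.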
| Item | Math | MOOCRadar |
| --- | --- | --- |
| Number of users | 31 279 | 14 226 |
| Number of exercises | 875 | 2 513 |
| Number of knowledge concepts | 184 | 34 |
| Number of answer logs | 422 585 | 456 456 |
| Average number of knowledge concepts per exercise | 1.53 | 1.24 |
Tab. 1 Experimental dataset statistics
| Model | ACC (Math) | MSE (Math) | ACC (MOOCRadar) | MSE (MOOCRadar) |
| --- | --- | --- | --- | --- |
| Llama_8B | 0.536 | 0.464 | 0.614 | 0.391 |
| Llama_8B_R | 0.545 | 0.455 | 0.627 | 0.380 |
| Llama_8B_DKT | 0.551 | 0.447 | 0.645 | 0.354 |
| Llama_8B_Agent | 0.554 | 0.441 | 0.653 | 0.340 |
| Llama_70B | 0.549 | 0.450 | 0.646 | 0.352 |
| Llama_70B_R | 0.567 | 0.441 | 0.656 | 0.338 |
| Llama_70B_DKT | 0.574 | 0.426 | 0.668 | 0.327 |
| Llama_70B_Agent | 0.582 | 0.422 | 0.687 | 0.312 |
| Qwen | 0.511 | 0.487 | 0.581 | 0.422 |
| Qwen_R | 0.525 | 0.475 | 0.599 | 0.403 |
| Qwen_DKT | 0.527 | 0.474 | 0.616 | 0.390 |
| Qwen_Agent | 0.534 | 0.463 | 0.622 | 0.384 |
| GLM | 0.558 | 0.449 | 0.643 | 0.351 |
| GLM_R | 0.574 | 0.426 | 0.688 | 0.312 |
| GLM_DKT | 0.570 | 0.426 | 0.734 | 0.266 |
| GLM_Agent | 0.617 | 0.385 | 0.793 | 0.207 |
| GPT4o | 0.586 | 0.420 | 0.661 | 0.333 |
| GPT4o_R | 0.587 | 0.419 | 0.680 | 0.320 |
| GPT4o_DKT | 0.591 | 0.413 | 0.742 | 0.257 |
| GPT4o_Agent | 0.595 | 0.410 | 0.769 | 0.231 |
Tab. 2 Comparison results of different models on two datasets
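This page does not spell out the evaluation protocol behind Tab. 2. Assuming ACC and MSE are computed over binary correctness predictions for held-out student responses (an assumption; `acc_mse` is an illustrative name), the two metrics could be reproduced as follows:

```python
def acc_mse(preds: list[float], labels: list[int]) -> tuple[float, float]:
    """preds: predicted probabilities that each response is correct;
    labels: ground-truth correctness (0 or 1)."""
    n = len(labels)
    acc = sum((p >= 0.5) == bool(y) for p, y in zip(preds, labels)) / n
    mse = sum((p - y) ** 2 for p, y in zip(preds, labels)) / n
    return acc, mse


# Example: acc_mse([0.8, 0.3, 0.6], [1, 0, 0]) ≈ (0.667, 0.163)
```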
| Model | ACC (Math) | ACC (MOOCRadar) |
| --- | --- | --- |
| Llama_8B_Agent | 0.554 | 0.653 |
| Llama_8B_Agent w/o Agent | 0.551 | 0.645 |
| Llama_8B_Agent w/o DKT | 0.545 | 0.627 |
| Llama_8B_Agent w/o Record | 0.536 | 0.614 |
| GPT4o_Agent | 0.595 | 0.769 |
| GPT4o_Agent w/o Agent | 0.591 | 0.742 |
| GPT4o_Agent w/o DKT | 0.587 | 0.680 |
| GPT4o_Agent w/o Record | 0.586 | 0.661 |
Tab. 3 Results of ablation experiments
| Student question | Model | Knowledge mastery | Answer |
| --- | --- | --- | --- |
| Introduce the knowledge concept "properties and simplification of quadratic radicals" | GLM | — | A quadratic radical refers to expressions like … |
| | GLM_Agent | Low | This knowledge concept is mainly about how to manipulate mathematical expressions containing radical signs. Simply put, a quadratic radical is an expression of the form … |
| | | High | A quadratic radical, as the name suggests, is an expression containing a square root. Its basic form is … |
Tab. 4 Model case analysis