Student grade prediction method based on knowledge graph and collaborative filtering

doi:10.11772/j.issn.1001-9081.2019071222

Journal of Computer Applications, 2020, Vol. 40, Issue (2): 595-601.

Section: Frontier, Interdisciplinary and Comprehensive Applications

Xi CHEN¹, Guang MEI¹, Jinjin ZHANG², Weisheng XU¹,³

1. College of Electronic and Information Engineering, Tongji University, Shanghai 201804, China
2. Education Technology and Computing Center, Tongji University, Shanghai 200092, China
3. Informatics Office, Tongji University, Shanghai 200092, China

Received: 2019-07-15 Revised: 2019-09-06 Accepted: 2019-09-06 Online: 2019-10-25 Published: 2020-02-10
Contact: Weisheng XU
About the authors: CHEN Xi, born in 1995 in Wuhu, Anhui, M. S. candidate. Her research interests include data mining and natural language processing.
MEI Guang, born in 1989 in Tianchang, Anhui, Ph. D. candidate. His research interests include education informatization, data mining and artificial intelligence.
ZHANG Jinjin, born in 1994 in Linyi, Shandong. Her research interests include the design of intelligent teaching environments and education informatization.
Supported by:
the National Natural Science Foundation of China (71540022)

Abstract:

To address student grade prediction in undergraduate higher education, a prediction algorithm based on a course Knowledge Graph (KG) is proposed. First, a course KG representing course information is constructed. Then, neighbor-based methods and KG representation learning-based methods are each used to compute the knowledge-level similarity between courses from the KG, and this similarity is integrated into Collaborative Filtering (CF), the traditional grade prediction framework. Finally, experiments compare the KG-fused algorithms with common grade prediction algorithms under different levels of data sparsity. In the sparse-data scenario, the neighbor-based algorithm reduces the Root Mean Square Error (RMSE) by about 11% and the Mean Absolute Error (MAE) by about 9% relative to the traditional CF algorithm, while the KG representation learning-based algorithm reduces the RMSE by 17.55% and the MAE by 11.40%. These results indicate that CF combined with a KG significantly reduces the prediction error, and that the KG can supplement the missing information when historical data is scarce, helping CF obtain better predictions.

Key words: Collaborative Filtering (CF), Knowledge Graph (KG), grade prediction, Educational Data Mining (EDM), intelligent campus
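The integration step described in the abstract can be illustrated with a minimal sketch of item-based CF in which the course-to-course similarity is a blend of a grade-based similarity and a knowledge-level similarity. The linear blend, the weight `alpha`, the function names and the toy grade data are all assumptions made for illustration; the abstract does not state how the two similarities are actually combined.

```python
from math import sqrt

def cosine_sim(grades_a, grades_b):
    """Cosine similarity of two courses over the students who took both."""
    common = set(grades_a) & set(grades_b)
    if not common:
        return 0.0
    dot = sum(grades_a[s] * grades_b[s] for s in common)
    norm_a = sqrt(sum(grades_a[s] ** 2 for s in common))
    norm_b = sqrt(sum(grades_b[s] ** 2 for s in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def blended_sim(cf_sim, kg_sim, alpha=0.5):
    """Hypothetical linear blend: alpha weights the grade-based similarity."""
    return alpha * cf_sim + (1.0 - alpha) * kg_sim

def predict_grade(student, target, grades, kg_sim, alpha=0.5):
    """Similarity-weighted average of the student's grades in other courses.

    grades: {course: {student: grade}}
    kg_sim: {(course_a, course_b): similarity}, keyed by the sorted course pair
    Returns None when no course the student took is similar to the target.
    """
    num = den = 0.0
    for course, by_student in grades.items():
        if course == target or student not in by_student:
            continue
        pair = tuple(sorted((course, target)))
        sim = blended_sim(cosine_sim(grades[target], by_student),
                          kg_sim.get(pair, 0.0), alpha)
        num += sim * by_student[student]
        den += abs(sim)
    return num / den if den else None

# Toy data (assumed): "Mechanics" has no grade history yet.
grades = {
    "Physics": {"s1": 3.5, "s3": 2.0},
    "History": {"s3": 4.0},
    "Mechanics": {},
}
kg_sim = {("Mechanics", "Physics"): 0.9, ("History", "Mechanics"): 0.1}

pure_cf = predict_grade("s3", "Mechanics", grades, kg_sim, alpha=1.0)
with_kg = predict_grade("s3", "Mechanics", grades, kg_sim, alpha=0.5)
print(pure_cf)            # None: no grade overlap to build a similarity from
print(round(with_kg, 2))  # 2.2
```

With `alpha=1.0` the sketch falls back to pure grade-based CF and cannot predict for a course without history, whereas the blended similarity still yields a prediction dominated by the knowledge-related course. This mirrors the sparse-data argument of the abstract in a toy setting only.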

CLC number: TP391.1

Cite this article:

Xi CHEN, Guang MEI, Jinjin ZHANG, Weisheng XU. Student grade prediction method based on knowledge graph and collaborative filtering[J]. Journal of Computer Applications, 2020, 40(2): 595-601.

Figures/Tables: 10

References: 32


Entity type	Count
Course	5,378
Department	601
Textbook	2,187
Reference book	2,063
Knowledge point	7,779
Teaching mode	3

Relation	Count
Department-OFFER-Course	5,378
Course-COVER-Knowledge point	58,939
Course-UTILIZE-Teaching mode	336
Course-TAKE-Textbook	2,581
Course-REFER-Reference book	2,063
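A minimal sketch of how neighbor-based course similarity could be computed over triples with relation types such as OFFER, COVER and TAKE. The index definitions are the standard link-prediction formulas for Common Neighbor, Total Neighbors, Preferential Attachment, Adamic-Adar and Resource Allocation; the toy triples and helper names are assumptions, and the graph is treated as undirected, which the page does not confirm.

```python
from math import log

def build_neighbors(triples):
    """Undirected adjacency sets from (head, relation, tail) triples."""
    nbrs = {}
    for head, _relation, tail in triples:
        nbrs.setdefault(head, set()).add(tail)
        nbrs.setdefault(tail, set()).add(head)
    return nbrs

def common_neighbors(nbrs, a, b):
    return len(nbrs[a] & nbrs[b])

def total_neighbors(nbrs, a, b):
    return len(nbrs[a] | nbrs[b])

def preferential_attachment(nbrs, a, b):
    return len(nbrs[a]) * len(nbrs[b])

def adamic_adar(nbrs, a, b):
    # A shared neighbor of two distinct nodes has degree >= 2, so log(...) > 0.
    return sum(1.0 / log(len(nbrs[z])) for z in nbrs[a] & nbrs[b])

def resource_allocation(nbrs, a, b):
    return sum(1.0 / len(nbrs[z]) for z in nbrs[a] & nbrs[b])

# Toy triples (assumed), not taken from the paper's graph.
triples = [
    ("Math Dept", "OFFER", "Calculus"),
    ("Math Dept", "OFFER", "Linear Algebra"),
    ("Physics Dept", "OFFER", "Physics"),
    ("Calculus", "COVER", "limits"),
    ("Calculus", "COVER", "derivatives"),
    ("Physics", "COVER", "derivatives"),
    ("Linear Algebra", "COVER", "matrices"),
    ("Calculus", "TAKE", "Textbook A"),
]
nbrs = build_neighbors(triples)

print(common_neighbors(nbrs, "Calculus", "Physics"))         # 1
print(total_neighbors(nbrs, "Calculus", "Physics"))          # 5
print(preferential_attachment(nbrs, "Calculus", "Physics"))  # 8
print(resource_allocation(nbrs, "Calculus", "Physics"))      # 0.5
```

Here Calculus and Physics share only the knowledge point "derivatives", which has degree 2, so Resource Allocation gives 1/2 and Adamic-Adar gives 1/log 2. Raw scores like these would still need normalizing before being used as a similarity in CF.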

Scenario	Algorithm	RMSE	RMSE reduction/%	MAE	MAE reduction/%
1	Normal Prediction	1.1751	-	0.9317	-
1	MF	0.8898	-	0.6788	-
1	Item-Based CF	0.8215	-	0.4159	-
1	Same Community	0.7795	5.11	0.3979	4.33
1	Adamic Adar	0.7293	11.22	0.3773	9.28
1	Common Neighbor	0.7290	11.26	0.3773	9.28
1	Prefer Attachment	0.8576	-4.39	0.4771	-14.72
1	Resource Allocation	0.7298	11.16	0.3781	9.09
1	Total Neighbors	0.8459	-2.97	0.4702	-13.06
2	Normal Prediction	0.9782	-	0.8218	-
2	MF	0.7378	-	0.4002	-
2	Item-Based CF	0.6884	-	0.3515	-
2	Same Community	0.6519	5.30	0.3331	5.23
2	Adamic Adar	0.6266	8.98	0.3186	9.36
2	Common Neighbor	0.6259	9.08	0.3183	9.45
2	Prefer Attachment	0.7300	-6.04	0.3977	-13.14
2	Resource Allocation	0.6299	8.50	0.3214	8.56
2	Total Neighbors	0.7205	-4.66	0.3926	-11.69
3	Normal Prediction	0.8873	-	0.7906	-
3	MF	0.6818	-	0.4176	-
3	Item-Based CF	0.5497	-	0.3412	-
3	Same Community	0.5842	-6.28	0.3843	-12.63
3	Adamic Adar	0.5296	3.66	0.3314	2.87
3	Common Neighbor	0.5316	3.29	0.3321	2.67
3	Prefer Attachment	0.6018	-9.48	0.3676	-7.74
3	Resource Allocation	0.5935	-7.97	0.3606	-5.69
3	Total Neighbors	0.5518	-0.38	0.3396	0.47
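The reduction rates in the table above are numerically consistent with a relative change against the Item-Based CF row of the same scenario, where a positive value means a lower error. A short sketch with the standard RMSE and MAE definitions makes that arithmetic explicit; the function names and the three-element toy vectors are illustrative assumptions.

```python
from math import sqrt

def rmse(y_true, y_pred):
    """Root mean square error."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def reduction_rate(baseline, value):
    """Relative error reduction in percent; negative means the error grew."""
    return (baseline - value) / baseline * 100.0

# Scenario 1, Adamic Adar versus Item-Based CF (values from the table).
print(round(reduction_rate(0.8215, 0.7293), 2))  # 11.22
print(round(reduction_rate(0.4159, 0.3773), 2))  # 9.28

# Toy predictions (assumed) to exercise the two error metrics.
y_true = [3.0, 4.0, 2.0]
y_pred = [2.5, 4.0, 3.0]
print(mae(y_true, y_pred))  # 0.5
```

The same formula reproduces the negative entries, for example Prefer Attachment in scenario 1, whose RMSE of 0.8576 is 4.39% higher than the Item-Based CF value.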


Method	MRR (training set)	MRR (test set)	Hit@10/% (training set)	Hit@10/% (test set)
TransE	0.1962	0.1462	90.00	69.80
DistMult	0.7541	0.4992	98.00	84.65
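For readers unfamiliar with the two metrics in this table, the sketch below shows how MRR and Hit@10 are commonly computed from the rank of the correct entity in each test query, together with the usual scoring functions of TransE (a translation distance, here with the L1 norm) and DistMult (a trilinear product). The rank list and the two-dimensional embeddings are made-up examples, and the paper's exact evaluation protocol is not given on this page.

```python
def mrr(ranks):
    """Mean reciprocal rank; ranks are 1-based positions of the correct entity."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def hit_at_k(ranks, k=10):
    """Percentage of queries whose correct entity is ranked within the top k."""
    return 100.0 * sum(1 for r in ranks if r <= k) / len(ranks)

def transe_score(h, r, t):
    """TransE plausibility: negative L1 distance between h + r and t."""
    return -sum(abs(hi + ri - ti) for hi, ri, ti in zip(h, r, t))

def distmult_score(h, r, t):
    """DistMult plausibility: sum of element-wise products of h, r and t."""
    return sum(hi * ri * ti for hi, ri, ti in zip(h, r, t))

# Toy ranks (assumed) for four test triples.
ranks = [1, 2, 5, 20]
print(mrr(ranks))       # 0.4375
print(hit_at_k(ranks))  # 75.0

# Toy embeddings (assumed): h + r equals t, so TransE gives its best score.
h, r, t = [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]
print(transe_score(h, r, t) == 0)  # True
print(distmult_score([1.0, 2.0], [1.0, 1.0], [3.0, 0.5]))  # 4.0
```

Higher is better for both scores as written, so a candidate tail entity is ranked by sorting its scores in descending order before the rank is read off.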
