Journal of Computer Applications (《计算机应用》), sole official website


User-oriented multi-behavior reinforcement learning model for learning path recommendation

Chen Pengyu (陈鹏宇)1, Tian Baojun (田保军)2, Zhao Lichang (赵利畅)1, Fang Jiandong (房建东)1

  1. Inner Mongolia University of Technology
  2. College of Information Engineering, Jinchuan Campus, Inner Mongolia University of Technology
  • Received: 2025-07-31 Revised: 2025-10-11 Online: 2025-11-05 Published: 2025-11-05
  • Corresponding author: Chen Pengyu

Abstract: To address sparse interaction data and poorly planned learning resources in learning path recommendation, a user-oriented multi-behavior Reinforcement Learning (RL) model for learning path recommendation, named MBRL4LP, was proposed. First, user behavior data were categorized and integrated into a course knowledge graph as entity nodes, and a Graph Convolutional Network (GCN) with an attention mechanism was used to capture multi-source heterogeneous features. Second, three data augmentation strategies were designed along the behavior and learning resource dimensions, and contrastive learning was applied to learn representations of the augmented data, which were fed into RL as supplementary information. Finally, with individual user differences taken into account, a personalized learning path recommendation model was built on a deep Q-network, and a dual reward function mechanism covering both knowledge points and paths was designed to guide model convergence. MBRL4LP was compared with learning path recommendation models such as LPG, KTKDM, and LSTMPR on four courses from the real-world datasets MOOPer and MOOCCubeX. Experimental results show that, relative to the comparison models, MBRL4LP improves precision, recall, and F1-score by at least 7.42%, 5.97%, and 6.15% on MOOPer, and by at least 4.75%, 6.62%, and 6.42% on MOOCCubeX. Parameter sensitivity analysis and ablation experiments further verify the effectiveness of the proposed model.
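The abstract reports precision, recall, and F1-score but does not describe the evaluation protocol. As a minimal sketch only, the Python below shows one common way these three metrics can be computed for a single recommended path against a set of relevant items. The function name `path_prf` and the item IDs (`k1`, `k2`, ...) are hypothetical and do not come from the paper; the authors' actual procedure may differ.

```python
def path_prf(recommended, relevant):
    """Precision, recall and F1 of one recommended path.

    Assumption: a recommended item counts as a hit if it appears in the
    set of relevant items; order within the path is ignored here.
    """
    rec = list(dict.fromkeys(recommended))  # drop repeated items, keep order
    rel = set(relevant)
    hits = sum(1 for item in rec if item in rel)

    precision = hits / len(rec) if rec else 0.0
    recall = hits / len(rel) if rel else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical example: 4 recommended items, 3 relevant items, 2 in common.
p, r, f1 = path_prf(["k1", "k2", "k3", "k5"], ["k1", "k2", "k4"])
print(f"precision={p:.4f} recall={r:.4f} f1={f1:.4f}")
# precision=0.5000 recall=0.6667 f1=0.5714
```

In this sketch the metrics would be averaged over users or courses to obtain dataset-level figures; that averaging step is likewise an assumption rather than something stated in the abstract.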

Key words: Graph Convolutional Network (GCN)
