Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (3): 736-742. DOI: 10.11772/j.issn.1001-9081.2021040845

• 2021 CCF Conference on Artificial Intelligence (CCFAI 2021)

Multi-person classroom action recognition in classroom teaching videos based on deep spatiotemporal residual convolution neural network

Yongkang HUANG, Meiyu LIANG, Xiaoxiao WANG, Zheng CHEN, Xiaowen CAO

  1. School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Received: 2021-05-24 Revised: 2021-07-08 Accepted: 2021-07-09 Online: 2021-11-09 Published: 2022-03-10
  • Contact: Meiyu LIANG
  • About the authors: HUANG Yongkang, born in 1998 in Zhangshu, Jiangxi, M. S. candidate, CCF member. His research interests include computer vision and deep learning.
    WANG Xiaoxiao, born in 1996 in Changzhi, Shanxi, M. S. candidate. Her research interests include deep learning and cross-modal retrieval.
    CHEN Zheng, born in 1996 in Jinchang, Gansu, M. S. candidate, CCF member. His research interests include computer vision and deep learning.
    CAO Xiaowen, born in 1998 in Lüliang, Shanxi, M. S. candidate. Her research interests include deep learning and cross-modal retrieval.
  • Supported by:
    National Natural Science Foundation of China (61877006)

Abstract:

Classroom teaching scenes are crowded and heavily occluded, existing video action recognition algorithms are poorly suited to them, and no public dataset of student classroom actions is available. To address these problems, a classroom teaching video library and a student classroom action library were constructed, and a real-time multi-person student classroom action recognition algorithm based on a deep spatiotemporal residual convolution neural network was proposed. Firstly, real-time object detection and tracking were combined to obtain a real-time image stream for each student. Then, the deep spatiotemporal residual convolution neural network was used to learn the spatiotemporal features of each student's actions, realizing real-time recognition of classroom actions for multiple students in classroom teaching scenes. In addition, an intelligent teaching evaluation model was constructed, and an intelligent teaching evaluation system based on student classroom action recognition was designed and implemented to help improve teaching quality and support smart education. Experimental comparison and analysis on the classroom teaching video dataset show that the proposed real-time multi-person classroom action recognition model achieves an accuracy of 88.5%, and that the intelligent teaching evaluation system built on it also runs well on the classroom teaching video dataset.

Key words: deep spatiotemporal residual convolution neural network, object detection, object tracking, multi-person classroom action recognition, intelligent teaching evaluation

CLC Number:
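
The first step described in the abstract, turning detection and tracking output into one image stream per student, can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `build_student_clips`, the `(frame_index, track_id, crop)` tuple layout, the default clip length of 16 frames, and the use of non-overlapping clips are all assumptions made for illustration. Strings stand in for cropped student images.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Iterable, List, Tuple

# Hypothetical record produced by a detector + tracker:
# (frame_index, track_id, crop), where crop stands in for a cropped student image.
Detection = Tuple[int, int, str]


def build_student_clips(
    detections: Iterable[Detection], clip_len: int = 16
) -> List[Tuple[int, List[str]]]:
    """Group tracked detections by track ID into fixed-length, non-overlapping clips.

    Each emitted clip is (track_id, [crop, ...]) and could be fed to a
    spatiotemporal network for per-student action classification.
    clip_len=16 is an assumed value, not one taken from the paper.
    """
    # One bounded buffer per tracked student.
    buffers: Dict[int, Deque[str]] = defaultdict(lambda: deque(maxlen=clip_len))
    clips: List[Tuple[int, List[str]]] = []

    for _frame_index, track_id, crop in detections:
        buf = buffers[track_id]
        buf.append(crop)
        if len(buf) == clip_len:
            clips.append((track_id, list(buf)))
            buf.clear()  # start a fresh, non-overlapping clip for this student

    return clips
```

Keeping a separate buffer per track ID means a student who is briefly missed by the detector simply contributes frames more slowly, while other students' clips are unaffected. A sliding-window variant (not clearing the buffer) would give more frequent predictions at a higher compute cost.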