Journal of Computer Applications, 2022, Vol. 42, Issue (3): 736-742. DOI: 10.11772/j.issn.1001-9081.2021040845

Special Issue: Artificial Intelligence; 2021 CCF Conference on Artificial Intelligence (CCFAI 2021)

• 2021 CCF Conference on Artificial Intelligence (CCFAI 2021) •

Multi-person classroom action recognition in classroom teaching videos based on deep spatiotemporal residual convolution neural network

Yongkang HUANG, Meiyu LIANG, Xiaoxiao WANG, Zheng CHEN, Xiaowen CAO

  1. School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Received: 2021-05-24 Revised: 2021-07-08 Accepted: 2021-07-09 Online: 2021-11-09 Published: 2022-03-10
  • Contact: Meiyu LIANG
  • About author: HUANG Yongkang, born in 1998, M. S. candidate. His research interests include computer vision and deep learning.
    WANG Xiaoxiao, born in 1996, M. S. candidate. Her research interests include deep learning and cross-modal retrieval.
    CHEN Zheng, born in 1996, M. S. candidate. His research interests include computer vision and deep learning.
    CAO Xiaowen, born in 1998, M. S. candidate. Her research interests include deep learning and cross-modal retrieval.
  • Supported by:
    National Natural Science Foundation of China (61877006)


  • About author: HUANG Yongkang, born in 1998, male, native of Zhangshu, Jiangxi, M. S. candidate, CCF member. His main research interests are computer vision and deep learning.


Classroom teaching scenes involve heavy occlusion and large numbers of students, existing video action recognition algorithms are poorly suited to such scenes, and no public dataset of student classroom actions is available. To address these problems, a classroom teaching video library and a student classroom action library were constructed, and a real-time multi-person student classroom action recognition algorithm based on deep spatiotemporal residual convolution neural network was proposed. First, real-time object detection and tracking were combined to obtain a real-time image stream for each student. Then, the deep spatiotemporal residual convolution neural network was used to learn the spatiotemporal features of each student’s actions, enabling real-time recognition of the classroom behavior of multiple students in classroom teaching scenes. In addition, an intelligent teaching evaluation model was constructed, and an intelligent teaching evaluation system based on the recognition of students’ classroom actions was designed and implemented to help improve teaching quality and support intelligent education. Experimental comparison and analysis on the classroom teaching video dataset show that the proposed real-time multi-person classroom action recognition model achieves an accuracy of 88.5%, and that the intelligent teaching evaluation system based on classroom action recognition also performs well on the same dataset.
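
The per-student image stream described above can be illustrated with a minimal sketch. It assumes that an upstream detector and tracker (not shown) already supply a track ID and a cropped image for every student in each frame, and it only groups those crops into fixed-length clips that a spatiotemporal network could then classify. The class name `PerStudentClipBuffer`, the function `recognize_stream`, and the `clip_len` and `stride` values are hypothetical and are not taken from the paper.

```python
from collections import defaultdict, deque
from typing import Callable, Dict, Deque, Iterable, Iterator, List, Tuple

# One detection is (track_id, crop). The crop type is left opaque here;
# in practice it would be an image array produced by the detector/tracker.
Detection = Tuple[int, object]


class PerStudentClipBuffer:
    """Groups tracked per-student crops into fixed-length clips.

    Hypothetical helper: clip_len and stride are illustrative values,
    not settings reported in the paper.
    """

    def __init__(self, clip_len: int = 16, stride: int = 8) -> None:
        if clip_len <= 0 or stride <= 0:
            raise ValueError("clip_len and stride must be positive")
        self.clip_len = clip_len
        self.stride = stride
        # One sliding window of the most recent crops per track ID.
        self._frames: Dict[int, Deque[object]] = defaultdict(
            lambda: deque(maxlen=clip_len)
        )
        # Number of crops received per track since its last emitted clip.
        self._since_emit: Dict[int, int] = defaultdict(int)

    def push(self, detections: Iterable[Detection]) -> List[Tuple[int, List[object]]]:
        """Add one video frame's detections; return the clips that became ready."""
        ready = []
        for track_id, crop in detections:
            buf = self._frames[track_id]
            buf.append(crop)
            self._since_emit[track_id] += 1
            window_full = len(buf) == self.clip_len
            if window_full and self._since_emit[track_id] >= self.stride:
                ready.append((track_id, list(buf)))
                self._since_emit[track_id] = 0
        return ready


def recognize_stream(
    frames: Iterable[Iterable[Detection]],
    classify: Callable[[List[object]], str],
    buffer: PerStudentClipBuffer,
) -> Iterator[Tuple[int, str]]:
    """Yield (track_id, label) each time a student's clip is complete.

    `classify` stands in for the action recognition network.
    """
    for detections in frames:
        for track_id, clip in buffer.push(detections):
            yield track_id, classify(clip)
```

Keeping a separate sliding window per track ID means that each student is classified independently, even when students enter the view at different times; the overlap between consecutive clips is controlled by `stride`.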

Key words: deep spatiotemporal residual convolution neural network, object detection, object tracking, multi-person classroom action recognition, intelligent teaching evaluation



