Journal of Computer Applications, 2024, Vol. 44, Issue (5): 1511-1519. DOI: 10.11772/j.issn.1001-9081.2023050800

Special Issue: The 19th China Conference on Machine Learning (CCML 2023)


Driver behavior recognition based on dual-path spatiotemporal network

Zhiyuan XI1, Chao TANG1, Anyang TONG1, Wenjian WANG2

  1. School of Artificial Intelligence and Big Data, Hefei University, Hefei Anhui 230601, China
  2. School of Computer and Information Technology, Shanxi University, Taiyuan Shanxi 030006, China
  • Received: 2023-06-21 Revised: 2023-07-14 Accepted: 2023-07-24 Online: 2023-08-01 Published: 2024-05-10
  • Contact: Chao TANG
  • About author: XI Zhiyuan, born in 1995, M. S. candidate. His research interests include machine learning and computer vision.
    TONG Anyang, born in 1998, M. S. candidate. His research interests include machine learning and computer vision.
    WANG Wenjian, born in 1968, Ph. D., professor. Her research interests include machine learning and computational intelligence.
  • Supported by:
    National Natural Science Foundation of China (62076154); Natural Science Foundation of Anhui Province (2008085MF202); Graduate Academic Innovation Project of Anhui Province (2022xscx145); College Student Innovation and Entrepreneurship Training Program of Anhui Province (1602307783011602432)

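  • Author profiles (Chinese-language record): XI Zhiyuan (born 1995), male, from Hefei, Anhui; M. S. candidate; CCF member. Research interests: machine learning, computer vision.
    TONG Anyang (born 1998), male, from Hefei, Anhui; M. S. candidate; CCF member. Research interests: machine learning, computer vision.
    WANG Wenjian (born 1968), female, from Taiyuan, Shanxi; professor, doctoral supervisor, Ph. D.; CCF distinguished member. Research interests: machine learning, computational intelligence.
    First contact: TANG Chao (born 1977), male, from Hefei, Anhui; associate professor, Ph. D.; CCF member. Research interests: machine learning, computer vision.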

Abstract:

Dangerous driving behavior is one of the main causes of serious traffic accidents, so recognizing driver behavior is of great significance for engineering applications. Current mainstream vision-based detection methods focus on the local spatiotemporal features of driver behavior and pay less attention to global spatial features and long-term temporal correlation features, which limits their ability to use scene context information when identifying dangerous driving behaviors. To solve these problems, a driver behavior recognition method based on a dual-path spatiotemporal network was proposed, which integrates the advantages of different spatiotemporal pathways to enrich the behavioral features. Firstly, an improved Two-Stream convolutional Network (TSN) was used to learn spatiotemporal representations while reducing the sparsity of the extracted features. Secondly, a Transformer-based serial spatiotemporal network was constructed to supplement the long-term temporal correlation information. Finally, the two pathways were combined to make a fusion decision, enhancing the robustness of the model. Experimental results show that the proposed method achieves recognition accuracies of 99.85%, 99.94% and 98.77%, respectively, on three publicly available datasets: the driver fatigue detection dataset YawDD, the driver distraction detection dataset SF-DDDD (State-Farm Distracted Driver Detection Dataset), and the latest driver behavior recognition dataset SynDD1. On SynDD1 in particular, the recognition accuracy is 1.64 percentage points higher than that of the action recognition network MoviNet-A0. Ablation experiments also confirm that the proposed method recognizes driver behavior with high accuracy.

Key words: driver behavior recognition, dual-path spatiotemporal network, Two-Stream convolutional Network (TSN), Transformer
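
The abstract states that the two pathways make a fusion decision but does not give the fusion rule. As a minimal sketch, assuming late fusion by a weighted average of each pathway's softmax scores (an assumption for illustration, not the authors' stated method), the final decision step might look like the following. The function names, the `weight_a` parameter and the example logits are all hypothetical.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_decisions(logits_a, logits_b, weight_a=0.5):
    """Late fusion of two pathways by weighted averaging of class probabilities.

    logits_a: per-class scores from the first pathway (e.g. the two-stream path)
    logits_b: per-class scores from the second pathway (e.g. the Transformer path)
    weight_a: hypothetical mixing weight in [0, 1] given to the first pathway
    Returns (predicted_class_index, fused_probabilities).
    """
    if len(logits_a) != len(logits_b):
        raise ValueError("both pathways must score the same set of classes")
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")

    probs_a = softmax(logits_a)
    probs_b = softmax(logits_b)
    fused = [
        weight_a * pa + (1.0 - weight_a) * pb
        for pa, pb in zip(probs_a, probs_b)
    ]
    label = max(range(len(fused)), key=fused.__getitem__)
    return label, fused


# Made-up scores for three behavior classes; the pathways disagree on purpose.
two_stream_logits = [2.0, 0.5, 0.1]
transformer_logits = [0.2, 1.5, 0.3]

label, fused = fuse_decisions(two_stream_logits, transformer_logits)
print(label, [round(p, 3) for p in fused])
```

With equal weights the more confident pathway decides the outcome; lowering `weight_a` shifts the decision toward the second pathway. Other fusion rules (product of scores, a learned fusion layer) would fit the abstract's description equally well.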

