Driver behavior recognition based on dual-path spatiotemporal network

doi:10.11772/j.issn.1001-9081.2023050800

Journal of Computer Applications, 2024, Vol. 44, Issue (5): 1511-1519. DOI: 10.11772/j.issn.1001-9081.2023050800

Special Issue: The 19th China Conference on Machine Learning (CCML 2023)

Zhiyuan XI¹, Chao TANG¹, Anyang TONG¹, Wenjian WANG²

1. School of Artificial Intelligence and Big Data, Hefei University, Hefei Anhui 230601, China
2. School of Computer and Information Technology, Shanxi University, Taiyuan Shanxi 030006, China

Received: 2023-06-21 Revised: 2023-07-14 Accepted: 2023-07-24 Online: 2023-08-01 Published: 2024-05-10
Contact: Chao TANG
About author: XI Zhiyuan (席治远), born in 1995, from Hefei, Anhui, M. S. candidate, CCF member. His research interests include machine learning and computer vision.
TONG Anyang (童安炀), born in 1998, from Hefei, Anhui, M. S. candidate, CCF member. His research interests include machine learning and computer vision.
WANG Wenjian (王文剑), born in 1968, from Taiyuan, Shanxi, Ph. D., professor, doctoral supervisor, CCF distinguished member. Her research interests include machine learning and computational intelligence.
Corresponding author: TANG Chao (唐超), born in 1977, from Hefei, Anhui, Ph. D., associate professor, CCF member. His research interests include machine learning and computer vision.
Supported by:
National Natural Science Foundation of China (62076154); Natural Science Foundation of Anhui Province (2008085MF202); Graduate Academic Innovation Project of Anhui Province (2022xscx145); College Student Innovation and Entrepreneurship Training Program of Anhui Province (1602307783011602432)

Abstract

Dangerous driving behavior is one of the main causes of serious traffic accidents, so recognizing driver behavior is of great significance for engineering applications. Current mainstream vision-based detection methods focus on the local spatiotemporal features of driver behavior, and less research has been done on global spatial features and long-term temporal correlation features; as a result, these methods are limited in their ability to use scene context information when identifying dangerous driving behaviors. To solve these problems, a driver behavior recognition method based on a dual-path spatiotemporal network was proposed, which integrates the advantages of different spatiotemporal pathways to enrich the behavioral features. Firstly, an improved Two-Stream convolutional Network (TSN) was used to learn spatiotemporal representations while reducing the sparsity of the extracted features. Secondly, a Transformer-based serial spatiotemporal network was constructed to supplement long-term temporal correlation information. Finally, the two paths were combined to make a fusion decision, enhancing the robustness of the model. Experimental results show that the proposed method achieves recognition accuracies of 99.85%, 99.94% and 98.77% on three publicly available datasets: the driver fatigue detection dataset YawDD, the driver distraction detection dataset SF-DDDD (State-Farm Distracted Driver Detection Dataset), and the recent driver behavior recognition dataset SynDD1, respectively; on SynDD1 in particular, the recognition accuracy is 1.64 percentage points higher than that of MoviNet-A0, an action recognition network. Ablation experimental results also confirm that the proposed method has high recognition accuracy for driver behavior.

Key words: driver behavior recognition, dual-path spatiotemporal network, Two-Stream convolutional Network (TSN), Transformer
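The final fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the actual fusion rule, so everything here is an assumption: the function name `fuse_paths`, the weight `alpha`, the choice of a weighted average of softmax probabilities, and the toy logits are all hypothetical and only show the general idea of decision-level fusion between two paths.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_paths(logits_a, logits_b, alpha=0.5):
    """Decision-level fusion of two paths (ASSUMED rule, not the paper's).

    logits_a: class scores from one path (e.g. the two-stream path)
    logits_b: class scores from the other path (e.g. the Transformer path)
    alpha:    hypothetical weight given to the first path, in [0, 1]
    Returns (predicted_class_index, fused_probabilities).
    """
    if len(logits_a) != len(logits_b):
        raise ValueError("both paths must score the same set of classes")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    probs_a = softmax(logits_a)
    probs_b = softmax(logits_b)
    fused = [alpha * a + (1.0 - alpha) * b for a, b in zip(probs_a, probs_b)]
    return fused.index(max(fused)), fused


# Toy scores for three behavior classes (made-up numbers).
two_stream_logits = [2.0, 1.0, 0.1]
transformer_logits = [0.5, 2.5, 0.3]
label, probs = fuse_paths(two_stream_logits, transformer_logits, alpha=0.5)
print(label, [round(p, 3) for p in probs])
```

Setting `alpha` to 1 or 0 reduces the sketch to a single path, which is a convenient way to compare the fused decision against each path on its own.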

CLC Number:

TP391

Zhiyuan XI, Chao TANG, Anyang TONG, Wenjian WANG. Driver behavior recognition based on dual-path spatiotemporal network[J]. Journal of Computer Applications, 2024, 44(5): 1511-1519.

References

1	OLSON R L, HANOWSKI R J, HICKMAN J S, et al. Driver distraction in commercial vehicle operations: FMCSA-RRT-09-042[R]. Washington, DC: United States Department of Transportation, 2009-09-01.
2	LIU F, LI X, LV T, et al. A review of driver fatigue detection: progress and prospect[C]// Proceedings of the 2019 IEEE International Conference on Consumer Electronics. Piscataway: IEEE, 2019: 1-6. DOI: 10.1109/icce.2019.8662098
3	SIKANDER G, ANWAR S. Driver fatigue detection systems: a review[J]. IEEE Transactions on Intelligent Transportation Systems, 2019, 20(6): 2339-2352. DOI: 10.1109/tits.2018.2868499
4	KALAYCI T E, KALAYCI E G, LECHNER G, et al. Triangulated investigation of trust in automated driving: challenges and solution approaches for data integration[J]. Journal of Industrial Information Integration, 2021, 21: 100186. DOI: 10.1016/j.jii.2020.100186
5	ABTAHI S, OMIDYEGANEH M, SHIRMOHAMMADI S, et al. YawDD: a yawning detection dataset[C]// Proceedings of the 5th ACM Multimedia Systems Conference. New York: ACM, 2014: 24-28. DOI: 10.1145/2557642.2563678
6	RAHMAN M S, VENKATACHALAPATHY A, SHARMA A, et al. Synthetic distracted driving (SynDD1) dataset for analyzing distracted behaviors and various gaze zones of a driver[J]. Data in Brief, 2022, 46: 108793. DOI: 10.1016/j.dib.2022.108793
7	KASHEVNIK A, SHCHEDRIN R, KAISER C, et al. Driver distraction detection methods: a literature review and framework[J]. IEEE Access, 2021, 9: 60063-60076. DOI: 10.1109/access.2021.3073599
8	WANG J, CHAI W, VENKATACHALAPATHY A, et al. A survey on driver behavior analysis from in-vehicle cameras[J]. IEEE Transactions on Intelligent Transportation Systems, 2022, 23(8): 10186-10209. DOI: 10.1109/tits.2021.3126231
9	MURPHY-CHUTORIAN E, TRIVEDI M M. Head pose estimation and augmented reality tracking: an integrated system and evaluation for monitoring driver awareness[J]. IEEE Transactions on Intelligent Transportation Systems, 2010, 11(2): 300-311. DOI: 10.1109/tits.2010.2044241
10	ZHANG W, ZHANG H. Research on distracted driving identification of truck drivers based on simulated driving experiment[J]. IOP Conference Series: Earth and Environmental Science, 2021, 638: 012039. DOI: 10.1088/1755-1315/638/1/012039
11	OHN-BAR E, MARTIN S, TAWARI A, et al. Head, eye, and hand patterns for driver activity recognition[C]// Proceedings of the 2014 22nd International Conference on Pattern Recognition. Piscataway: IEEE, 2014: 660-665. DOI: 10.1109/icpr.2014.124
12	ZHANG L, TAN B, LIU T, et al. Research on recognition of dangerous driving behavior based on support vector machine[C/OL]// Proceedings of the Twelfth International Conference on Graphics and Image Processing. Bellingham: SPIE, 2021, 11720 [2023-05-01].
13	BRAUNAGEL C, KASNECI E, STOLZMANN W, et al. Driver-activity recognition in the context of conditionally autonomous driving[C]// Proceedings of the 2015 IEEE 18th International Conference on Intelligent Transportation Systems. Piscataway: IEEE, 2015: 1652-1657. DOI: 10.1109/itsc.2015.268
14	HOU Z, OU S, XU D. Research on fatigue driving feature detection algorithms of drivers based on machine learning[J]. Systems Science & Control Engineering, 2021, 9(1): 167-172. DOI: 10.1080/21642583.2021.1888819
15	OKON O D, MENG L. Detecting distracted driving with deep learning[C]// Proceedings of the 2017 International Conference on Interactive Collaborative Robotics. Cham: Springer, 2017: 170-179. DOI: 10.1007/978-3-319-66471-2_19
16	KOESDWIADY A, BEDAWI S M, OU C, et al. End-to-end deep learning for driver distraction recognition[C]// Proceedings of the 2017 International Conference on Image Analysis and Recognition. Cham: Springer, 2017: 11-18. DOI: 10.1007/978-3-319-59876-5_2
17	JAIN A, KOPPULA H S, SOH S, et al. Brain4Cars: car that knows before you do via sensory-fusion deep learning architecture[EB/OL]. [2022-12-26]. DOI: 10.1109/iccv.2015.364
18	TONG A, TANG C, WANG W. Semi-supervised action recognition from temporal augmentation using curriculum learning[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2023, 33(3): 1305-1319. DOI: 10.1109/tcsvt.2022.3210271
19	REN F, TANG C, TONG A, et al. Skeleton-based human action recognition by fusing attention based three-stream convolutional neural network and SVM[J]. Multimedia Tools and Applications, 2024, 83: 6273-6295. DOI: 10.1007/s11042-023-15334-9
20	FARNEBÄCK G. Two-frame motion estimation based on polynomial expansion[C]// Proceedings of the 13th Scandinavian Conference on Image Analysis. Berlin: Springer, 2003: 363-370. DOI: 10.1007/3-540-45103-x_50
21	TAMURA M, VISHWAKARMA R, VENNELAKANTI R. Hunting group clues with Transformers for social group activity recognition[C]// Proceedings of the 17th European Conference on Computer Vision. Cham: Springer, 2022: 19-35. DOI: 10.1007/978-3-031-19772-7_2
22	SIMONYAN K, ZISSERMAN A. Two-stream convolutional networks for action recognition in videos[C]// Proceedings of the 27th International Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2014: 568-576.
23	SCHÖLKOPF B, SMOLA A, MÜLLER K R. Kernel principal component analysis[C]// Proceedings of the 1997 International Conference on Artificial Neural Networks. Berlin: Springer, 1997: 583-588. DOI: 10.1007/bfb0020217
24	YANG J, YANG J-Y, ZHANG D, et al. Feature fusion: parallel strategy vs. serial strategy[J]. Pattern Recognition, 2003, 36(6): 1369-1381. DOI: 10.1016/s0031-3203(02)00262-5
25	HAN M, ZHANG D J, WANG Y, et al. Dual-AI: dual-path actor interaction learning for group activity recognition[C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 2980-2989. DOI: 10.1109/cvpr52688.2022.00300
26	VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2017: 6000-6010.
27	ZHANG W, SU J. Driver yawning detection based on long short term memory networks[C]// Proceedings of the 2017 IEEE Symposium Series on Computational Intelligence. Piscataway: IEEE, 2017: 1-5. DOI: 10.1109/ssci.2017.8285343
28	DONG B-T, LIN H-Y. An on-board monitoring system for driving fatigue and distraction detection[C]// Proceedings of the 2021 22nd IEEE International Conference on Industrial Technology. Piscataway: IEEE, 2021: 850-855. DOI: 10.1109/icit46573.2021.9453676
29	YOU F, GONG Y, TU H, et al. A fatigue driving detection algorithm based on facial motion information entropy[J]. Journal of Advanced Transportation, 2020, 2020: 8851485. DOI: 10.1155/2020/8851485
30	SAURAV S, MATHUR S, SANG I, et al. Yawn detection for driver's drowsiness prediction using bi-directional LSTM with CNN features[C]// Proceedings of the 11th Intelligent Human Computer Interaction. Cham: Springer, 2020: 189-200. DOI: 10.1007/978-3-030-44689-5_17
31	SAVAŞ B K, BECERIKLI Y. A deep learning approach to driver fatigue detection via mouth state analyses and yawning detection[J]. IOSR Journal of Computer Engineering, 2021, 23(3): 24-30.
32	RAO X, LIN F, CHEN Z, et al. Distracted driving recognition method based on deep convolutional neural network[J]. Journal of Ambient Intelligence and Humanized Computing, 2021, 12: 193-200. DOI: 10.1007/s12652-019-01597-4
33	UNADKAT V, SAYANI P, KAPADIA H, et al. Automated system for detecting distracted driver[C]// Proceedings of the 2018 4th International Conference on Computing Communication and Automation. Piscataway: IEEE, 2018: 1-4. DOI: 10.1109/ccaa.2018.8777709
34	MASOOD S, RAI A, AGGARWAL A, et al. Detecting distraction of drivers using convolutional neural network[J]. Pattern Recognition Letters, 2020, 139: 79-85. DOI: 10.1016/j.patrec.2017.12.023
35	NGUYEN H-Q, NGUYEN T-B, TRAN T K, et al. End-to-end deep learning-based framework for driver action recognition[C]// Proceedings of the 2022 International Conference on Multimedia Analysis and Pattern Recognition. Piscataway: IEEE, 2022: 1-6. DOI: 10.1109/mapr56351.2022.9924944

Actual class	Predicted P	Predicted N
P	TP	FN
N	FP	TN
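The Acc, P, R and F₁ columns reported in the results tables are conventionally derived from these four confusion-matrix counts. The sketch below uses the standard textbook definitions for the binary case; the function name `binary_metrics` and the example counts are hypothetical, and how the paper aggregates per-class values on its multi-class datasets is not stated on this page.

```python
def binary_metrics(tp, fn, fp, tn):
    """Standard metrics from binary confusion-matrix counts.

    tp: actual P predicted P    fn: actual P predicted N
    fp: actual N predicted P    tn: actual N predicted N
    Ratios with an empty denominator are reported as 0.0.
    """
    total = tp + fn + fp + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up counts, for illustration only.
m = binary_metrics(tp=90, fn=10, fp=5, tn=95)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

F₁ is the harmonic mean of precision and recall, so it stays low whenever either of the two is low.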

Environment	Setting	Environment	Setting
Operating system	Windows 10	PyTorch	1.8.1
GPU	RTX 3060	CUDA	11.3.1
Python	3.9

Method	YawDD (%)				SF-DDDD (%)				SynDD1 (%)
	Acc	P	R	F₁	Acc	P	R	F₁	Acc	P	R	F₁
CNN_temporal	95.50	95.47	95.47	95.47	84.89	84.83	84.83	84.83	86.20	86.21	86.21	86.21
CNN_spatio	99.64	99.65	99.65	99.65	99.51	99.51	99.50	99.50	97.20	97.21	97.18	97.20
TSN(a)	99.63	99.65	99.64	99.64	98.72	98.68	98.70	98.69	97.17	97.18	97.17	97.17
TSN(b)	99.70	99.68	99.71	99.70	99.46	99.44	99.41	99.42	97.22	97.23	97.23	97.23
TSN(b)-PCA	99.56	99.60	99.56	99.58	99.33	99.34	99.37	99.35	97.25	97.23	97.24	97.24
TSN(b)-KPCA	99.72	99.71	99.73	99.72	99.55	99.57	99.56	99.56	98.43	98.41	98.40	98.40
DPST(a)	99.81	99.80	99.80	99.80	99.71	99.72	99.73	99.72	98.51	98.49	98.50	98.49
DPST(b)	99.85	99.83	99.85	99.84	99.94	99.94	99.95	99.95	98.77	98.77	98.77	98.77

Method	YawDD	SF-DDDD	SynDD1
LSTM [27]	88.60	-	-
EAR+CNN [28]	91.00	97.50	-
Multi-feature fusion SVM [29]	94.32	-	-
CNN+Bi-LSTM [30]	96.48	-	-
Improved CNN [31]	99.35	-	-
PCA+CNN [32]	-	97.31	-
AlexNet [33]	-	99.49	-
VGG16 [34]	-	99.57	-
MoviNet-A0 [35]	-	-	97.13
Proposed method	99.85	99.94	98.77
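The percentage-point gains implied by this comparison can be checked with a few lines of arithmetic. The sketch below copies the accuracies from the table and subtracts the strongest listed baseline on each dataset from the proposed method's accuracy; the variable names are illustrative, and `Decimal` is used only to keep the two-decimal values exact.

```python
from decimal import Decimal

# Accuracy (%) of the proposed method, copied from the comparison table.
proposed = {
    "YawDD": Decimal("99.85"),
    "SF-DDDD": Decimal("99.94"),
    "SynDD1": Decimal("98.77"),
}

# Strongest baseline listed for each dataset in the same table:
# improved CNN [31], VGG16 [34], MoviNet-A0 [35].
best_baseline = {
    "YawDD": ("Improved CNN [31]", Decimal("99.35")),
    "SF-DDDD": ("VGG16 [34]", Decimal("99.57")),
    "SynDD1": ("MoviNet-A0 [35]", Decimal("97.13")),
}

# Gain in percentage points = proposed accuracy - baseline accuracy.
gains = {name: proposed[name] - best_baseline[name][1] for name in proposed}

for name, gain in gains.items():
    baseline_name = best_baseline[name][0]
    print(f"{name}: +{gain} percentage points over {baseline_name}")
```

The SynDD1 difference matches the 1.64 percentage points quoted in the abstract; the other two gains are simply read off the table and are not claims made elsewhere on this page.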
