Driver behavior recognition based on dual-path spatiotemporal network

doi:10.11772/j.issn.1001-9081.2023050800

Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (5): 1511-1519. DOI: 10.11772/j.issn.1001-9081.2023050800

Special topic: The 19th China Conference on Machine Learning (CCML 2023)


Zhiyuan XI¹, Chao TANG¹, Anyang TONG¹, Wenjian WANG²

¹ School of Artificial Intelligence and Big Data, Hefei University, Hefei, Anhui 230601, China
² School of Computer and Information Technology, Shanxi University, Taiyuan, Shanxi 030006, China

Received: 2023-06-21; Revised: 2023-07-14; Accepted: 2023-07-24; Online: 2023-08-01; Published: 2024-05-10
Corresponding author: Chao TANG
About the authors:
XI Zhiyuan, born in 1995, native of Hefei, Anhui, M.S. candidate, CCF member. His research interests include machine learning and computer vision.
TANG Chao (first contact), born in 1977, native of Hefei, Anhui, Ph.D., associate professor, CCF member. His research interests include machine learning and computer vision.
TONG Anyang, born in 1998, native of Hefei, Anhui, M.S. candidate, CCF member. His research interests include machine learning and computer vision.
WANG Wenjian, born in 1968, native of Taiyuan, Shanxi, Ph.D., professor, doctoral supervisor, CCF distinguished member. Her research interests include machine learning and computational intelligence.
Supported by:
National Natural Science Foundation of China (62076154); Natural Science Foundation of Anhui Province (2008085MF202); Graduate Academic Innovation Project of Anhui Province (2022xscx145); College Student Innovation and Entrepreneurship Training Program of Anhui Province (1602307783011602432)

Abstract

Dangerous driving behavior is one of the main causes of serious traffic accidents, so recognizing driver behavior is of great practical significance in engineering applications. Current mainstream vision-based detection methods focus on the local spatiotemporal features of driver behavior and pay little attention to global spatial features and long-term temporal correlation features, which limits their ability to use scene context when recognizing dangerous driving behaviors. To address these problems, a driver behavior recognition method based on a dual-path spatiotemporal network is proposed, which combines the advantages of different spatiotemporal pathways to enrich the behavioral features. Firstly, an improved Two-Stream convolutional Network (TSN) is used to learn representations of spatiotemporal information while reducing the sparsity of the extracted features. Secondly, a Transformer-based serial spatiotemporal network is constructed to supplement the long-term temporal correlation information. Finally, the two spatiotemporal paths are combined in a fusion decision to enhance the robustness of the model. Experimental results show that the proposed method achieves recognition accuracies of 99.85%, 99.94% and 98.77% on three public datasets: the driver fatigue detection dataset YawDD, the driver distraction detection dataset SF-DDDD (State-Farm Distracted Driver Detection Dataset), and the recent driver behavior recognition dataset SynDD1, respectively. On SynDD1 in particular, the recognition accuracy is 1.64 percentage points higher than that of MoviNet-A0, an action recognition network. Ablation experiments also confirm that the proposed method recognizes driver behavior with high accuracy.

Key words: driver behavior recognition, dual-path spatiotemporal network, Two-Stream convolutional Network (TSN), Transformer
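
The last step of the method, the fusion decision over the two spatiotemporal paths, can be illustrated with a small score-level example. The abstract does not state the fusion rule, so the weighted average of per-class probabilities below, the function name `fuse_scores`, the weight `w_tsn`, and the toy scores are all assumptions made for illustration, not the authors' implementation.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Turn raw class scores into probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(
    tsn_logits: Sequence[float],
    transformer_logits: Sequence[float],
    w_tsn: float = 0.5,
) -> Tuple[int, List[float]]:
    """Late fusion of two paths by a weighted average of class probabilities.

    ASSUMPTION: the fusion rule and the weight are hypothetical; the source
    only says that the two paths are combined in a fusion decision.
    """
    if len(tsn_logits) != len(transformer_logits):
        raise ValueError("both paths must score the same set of classes")
    if not 0.0 <= w_tsn <= 1.0:
        raise ValueError("w_tsn must lie in [0, 1]")
    p_tsn = softmax(tsn_logits)
    p_transformer = softmax(transformer_logits)
    fused = [w_tsn * a + (1.0 - w_tsn) * b for a, b in zip(p_tsn, p_transformer)]
    label = max(range(len(fused)), key=fused.__getitem__)
    return label, fused


# Toy scores for three hypothetical behavior classes.
label, fused = fuse_scores([2.0, 1.9, 0.1], [0.2, 3.0, 0.1])
print(label, [round(p, 3) for p in fused])
```

In this toy case the first path slightly prefers class 0 while the second path strongly prefers class 1, and the fused decision is class 1. Setting `w_tsn=1.0` falls back to the first path alone.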

CLC number: TP391

Zhiyuan XI, Chao TANG, Anyang TONG, Wenjian WANG. Driver behavior recognition based on dual-path spatiotemporal network[J]. Journal of Computer Applications, 2024, 44(5): 1511-1519.

Figures and tables: 13

References (35)

1	OLSON R L, HANOWSKI R J, HICKMAN J S, et al. Driver distraction in commercial vehicle operations: FMCSA-RRT-09-042[R]. Washington, DC: United States Department of Transportation, 2009-09-01.
2	LIU F, LI X, LV T, et al. A review of driver fatigue detection: progress and prospect[C]// Proceedings of the 2019 IEEE International Conference on Consumer Electronics. Piscataway: IEEE, 2019: 1-6. DOI: 10.1109/icce.2019.8662098
3	SIKANDER G, ANWAR S. Driver fatigue detection systems: a review[J]. IEEE Transactions on Intelligent Transportation Systems, 2019, 20(6): 2339-2352. DOI: 10.1109/tits.2018.2868499
4	KALAYCI T E, KALAYCI E G, LECHNER G, et al. Triangulated investigation of trust in automated driving: challenges and solution approaches for data integration[J]. Journal of Industrial Information Integration, 2021, 21: 100186. DOI: 10.1016/j.jii.2020.100186
5	ABTAHI S, OMIDYEGANEH M, SHIRMOHAMMADI S, et al. YawDD: a yawning detection dataset[C]// Proceedings of the 5th ACM Multimedia Systems Conference. New York: ACM, 2014: 24-28. DOI: 10.1145/2557642.2563678
6	RAHMAN M S, VENKATACHALAPATHY A, SHARMA A, et al. Synthetic distracted driving (SynDD1) dataset for analyzing distracted behaviors and various gaze zones of a driver[J]. Data in Brief, 2022, 46: 108793. DOI: 10.1016/j.dib.2022.108793
7	KASHEVNIK A, SHCHEDRIN R, KAISER C, et al. Driver distraction detection methods: a literature review and framework[J]. IEEE Access, 2021, 9: 60063-60076. DOI: 10.1109/access.2021.3073599
8	WANG J, CHAI W, VENKATACHALAPATHY A, et al. A survey on driver behavior analysis from in-vehicle cameras[J]. IEEE Transactions on Intelligent Transportation Systems, 2022, 23(8): 10186-10209. DOI: 10.1109/tits.2021.3126231
9	MURPHY-CHUTORIAN E, TRIVEDI M M. Head pose estimation and augmented reality tracking: an integrated system and evaluation for monitoring driver awareness[J]. IEEE Transactions on Intelligent Transportation Systems, 2010, 11(2): 300-311. DOI: 10.1109/tits.2010.2044241
10	ZHANG W, ZHANG H. Research on distracted driving identification of truck drivers based on simulated driving experiment[J]. IOP Conference Series: Earth and Environmental Science, 2021, 638: 012039. DOI: 10.1088/1755-1315/638/1/012039
11	OHN-BAR E, MARTIN S, TAWARI A, et al. Head, eye, and hand patterns for driver activity recognition[C]// Proceedings of the 2014 22nd International Conference on Pattern Recognition. Piscataway: IEEE, 2014: 660-665. DOI: 10.1109/icpr.2014.124
12	ZHANG L, TAN B, LIU T, et al. Research on recognition of dangerous driving behavior based on support vector machine[C/OL]// Proceedings of the Twelfth International Conference on Graphics and Image Processing. Bellingham: SPIE, 2021, 11720[2023-05-01].
13	BRAUNAGEL C, KASNECI E, STOLZMANN W, et al. Driver-activity recognition in the context of conditionally autonomous driving[C]// Proceedings of the 2015 IEEE 18th International Conference on Intelligent Transportation Systems. Piscataway: IEEE, 2015: 1652-1657. DOI: 10.1109/itsc.2015.268
14	HOU Z, OU S, XU D. Research on fatigue driving feature detection algorithms of drivers based on machine learning[J]. Systems Science & Control Engineering, 2021, 9(1): 167-172. DOI: 10.1080/21642583.2021.1888819
15	OKON O D, MENG L. Detecting distracted driving with deep learning[C]// Proceedings of the 2017 International Conference on Interactive Collaborative Robotics. Cham: Springer, 2017: 170-179. DOI: 10.1007/978-3-319-66471-2_19
16	KOESDWIADY A, BEDAWI S M, OU C, et al. End-to-end deep learning for driver distraction recognition[C]// Proceedings of the 2017 International Conference on Image Analysis and Recognition. Cham: Springer, 2017: 11-18. DOI: 10.1007/978-3-319-59876-5_2
17	JAIN A, KOPPULA H S, SOH S, et al. Brain4Cars: car that knows before you do via sensory-fusion deep learning architecture[EB/OL]. [2022-12-26]. DOI: 10.1109/iccv.2015.364
18	TONG A, TANG C, WANG W. Semi-supervised action recognition from temporal augmentation using curriculum learning[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2023, 33(3): 1305-1319. DOI: 10.1109/tcsvt.2022.3210271
19	REN F, TANG C, TONG A, et al. Skeleton-based human action recognition by fusing attention based three-stream convolutional neural network and SVM[J]. Multimedia Tools and Applications, 2024, 83: 6273-6295. DOI: 10.1007/s11042-023-15334-9
20	FARNEBÄCK G. Two-frame motion estimation based on polynomial expansion[C]// Proceedings of the 13th Scandinavian Conference on Image Analysis. Berlin: Springer, 2003: 363-370. DOI: 10.1007/3-540-45103-x_50
21	TAMURA M, VISHWAKARMA R, VENNELAKANTI R. Hunting group clues with Transformers for social group activity recognition[C]// Proceedings of the 17th European Conference on Computer Vision. Cham: Springer, 2022: 19-35. DOI: 10.1007/978-3-031-19772-7_2
22	SIMONYAN K, ZISSERMAN A. Two-stream convolutional networks for action recognition in videos[C]// Proceedings of the 27th International Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2014: 568-576.
23	SCHÖLKOPF B, SMOLA A, MÜLLER K-R. Kernel principal component analysis[C]// Proceedings of the 1997 International Conference on Artificial Neural Networks. Berlin: Springer, 1997: 583-588. DOI: 10.1007/bfb0020217
24	YANG J, YANG J-Y, ZHANG D, et al. Feature fusion: parallel strategy vs. serial strategy[J]. Pattern Recognition, 2003, 36(6): 1369-1381. DOI: 10.1016/s0031-3203(02)00262-5
25	HAN M, ZHANG D J, WANG Y, et al. Dual-AI: dual-path actor interaction learning for group activity recognition[C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 2980-2989. DOI: 10.1109/cvpr52688.2022.00300
26	VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2017: 6000-6010.
27	ZHANG W, SU J. Driver yawning detection based on long short term memory networks[C]// Proceedings of the 2017 IEEE Symposium Series on Computational Intelligence. Piscataway: IEEE, 2017: 1-5. DOI: 10.1109/ssci.2017.8285343
28	DONG B-T, LIN H-Y. An on-board monitoring system for driving fatigue and distraction detection[C]// Proceedings of the 2021 22nd IEEE International Conference on Industrial Technology. Piscataway: IEEE, 2021: 850-855. DOI: 10.1109/icit46573.2021.9453676
29	YOU F, GONG Y, TU H, et al. A fatigue driving detection algorithm based on facial motion information entropy[J]. Journal of Advanced Transportation, 2020, 2020: 8851485. DOI: 10.1155/2020/8851485
30	SAURAV S, MATHUR S, SANG I, et al. Yawn detection for driver's drowsiness prediction using bi-directional LSTM with CNN features[C]// Proceedings of the 11th Intelligent Human Computer Interaction. Cham: Springer, 2020: 189-200. DOI: 10.1007/978-3-030-44689-5_17
31	SAVAŞ B K, BECERIKLI Y. A deep learning approach to driver fatigue detection via mouth state analyses and yawning detection[J]. IOSR Journal of Computer Engineering, 2021, 23(3): 24-30.
32	RAO X, LIN F, CHEN Z, et al. Distracted driving recognition method based on deep convolutional neural network[J]. Journal of Ambient Intelligence and Humanized Computing, 2021, 12: 193-200. DOI: 10.1007/s12652-019-01597-4
33	UNADKAT V, SAYANI P, KAPADIA H, et al. Automated system for detecting distracted driver[C]// Proceedings of the 2018 4th International Conference on Computing Communication and Automation. Piscataway: IEEE, 2018: 1-4. DOI: 10.1109/ccaa.2018.8777709
34	MASOOD S, RAI A, AGGARWAL A, et al. Detecting distraction of drivers using convolutional neural network[J]. Pattern Recognition Letters, 2020, 139: 79-85. DOI: 10.1016/j.patrec.2017.12.023
35	NGUYEN H-Q, NGUYEN T-B, TRAN T K, et al. End-to-end deep learning-based framework for driver action recognition[C]// Proceedings of the 2022 International Conference on Multimedia Analysis and Pattern Recognition. Piscataway: IEEE, 2022: 1-6. DOI: 10.1109/mapr56351.2022.9924944

Actual class	Predicted P	Predicted N
P	TP	FN
N	FP	TN
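
The four cells of this confusion matrix are enough to compute the accuracy, precision, recall and F₁ columns reported in the result tables. The sketch below uses the standard definitions; the function name `binary_metrics` and the counts in the example are hypothetical and are not taken from the paper.

```python
from typing import Dict


def binary_metrics(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Standard metrics from the four cells of a binary confusion matrix.

    accuracy  = (TP + TN) / (TP + FN + FP + TN)
    precision = TP / (TP + FP)
    recall    = TP / (TP + FN)
    f1        = 2 * precision * recall / (precision + recall)
    """
    total = tp + fn + fp + tn
    if total == 0:
        raise ValueError("the confusion matrix is empty")
    accuracy = (tp + tn) / total
    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2.0 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts, chosen only to exercise the formulas.
print(binary_metrics(tp=90, fn=10, fp=5, tn=95))
```

With these made-up counts the accuracy is 185/200 and the recall is 90/100; F₁ can equivalently be written as 2TP / (2TP + FP + FN).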


Environment	Setting
Operating system	Windows 10
GPU	RTX 3060
Python	3.9
PyTorch	1.8.1
CUDA	11.3.1


Method	YawDD (%)				SF-DDDD (%)				SynDD1 (%)
	Acc	P	R	F₁	Acc	P	R	F₁	Acc	P	R	F₁
CNN_temporal	95.50	95.47	95.47	95.47	84.89	84.83	84.83	84.83	86.20	86.21	86.21	86.21
CNN_spatio	99.64	99.65	99.65	99.65	99.51	99.51	99.50	99.50	97.20	97.21	97.18	97.20
TSN(a)	99.63	99.65	99.64	99.64	98.72	98.68	98.70	98.69	97.17	97.18	97.17	97.17
TSN(b)	99.70	99.68	99.71	99.70	99.46	99.44	99.41	99.42	97.22	97.23	97.23	97.23
TSN(b)-PCA	99.56	99.60	99.56	99.58	99.33	99.34	99.37	99.35	97.25	97.23	97.24	97.24
TSN(b)-KPCA	99.72	99.71	99.73	99.72	99.55	99.57	99.56	99.56	98.43	98.41	98.40	98.40
DPST(a)	99.81	99.80	99.80	99.80	99.71	99.72	99.73	99.72	98.51	98.49	98.50	98.49
DPST(b)	99.85	99.83	99.85	99.84	99.94	99.94	99.95	99.95	98.77	98.77	98.77	98.77
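
The datasets in this table have more than two classes, so the P, R and F₁ columns must aggregate per-class values in some way. This excerpt does not say how, so the sketch below assumes macro averaging (an unweighted mean over classes, with each class scored one-vs-rest); the function name `macro_metrics` and the 3×3 matrix are hypothetical.

```python
from typing import Dict, List


def macro_metrics(confusion: List[List[int]]) -> Dict[str, float]:
    """Accuracy plus macro-averaged precision, recall and F1.

    confusion[i][j] counts samples whose true class is i and predicted class is j.
    ASSUMPTION: macro averaging; the source does not state the averaging scheme.
    """
    n = len(confusion)
    if n == 0 or any(len(row) != n for row in confusion):
        raise ValueError("confusion matrix must be square and non-empty")
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix has no samples")

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                          # class k, predicted as another class
        fp = sum(confusion[i][k] for i in range(n)) - tp     # other classes, predicted as k
        p = tp / (tp + fp) if (tp + fp) else 0.0
        r = tp / (tp + fn) if (tp + fn) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(2.0 * p * r / (p + r) if (p + r) else 0.0)

    correct = sum(confusion[k][k] for k in range(n))
    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted class).
print(macro_metrics([[8, 1, 1], [0, 9, 1], [1, 0, 9]]))
```

Here macro F₁ is the mean of the per-class F₁ values; another convention computes F₁ from the macro precision and recall, and the two generally differ slightly.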


Method	YawDD (%)	SF-DDDD (%)	SynDD1 (%)
LSTM [27]	88.60	—	—
EAR+CNN [28]	91.00	97.50	—
Multi-feature fusion SVM [29]	94.32	—	—
CNN+Bi-LSTM [30]	96.48	—	—
Improved CNN [31]	99.35	—	—
PCA+CNN [32]	—	97.31	—
AlexNet [33]	—	99.49	—
VGG16 [34]	—	99.57	—
MoviNet-A0 [35]	—	—	97.13
Proposed method	99.85	99.94	98.77
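
The margins implied by this comparison can be checked with a few lines of arithmetic. The accuracies below are copied from the table, with method names given in English; the helper name `margins_over_best_prior` is hypothetical.

```python
from typing import Dict, Optional, Tuple

DATASETS = ("YawDD", "SF-DDDD", "SynDD1")

# Accuracies (%) copied from the comparison table; None means not reported.
PRIOR: Dict[str, Tuple[Optional[float], Optional[float], Optional[float]]] = {
    "LSTM [27]": (88.60, None, None),
    "EAR+CNN [28]": (91.00, 97.50, None),
    "Multi-feature fusion SVM [29]": (94.32, None, None),
    "CNN+Bi-LSTM [30]": (96.48, None, None),
    "Improved CNN [31]": (99.35, None, None),
    "PCA+CNN [32]": (None, 97.31, None),
    "AlexNet [33]": (None, 99.49, None),
    "VGG16 [34]": (None, 99.57, None),
    "MoviNet-A0 [35]": (None, None, 97.13),
}
PROPOSED = (99.85, 99.94, 98.77)


def margins_over_best_prior() -> Dict[str, Tuple[str, float]]:
    """For each dataset, the best prior method and the gain in percentage points."""
    out: Dict[str, Tuple[str, float]] = {}
    for i, dataset in enumerate(DATASETS):
        reported = {m: acc[i] for m, acc in PRIOR.items() if acc[i] is not None}
        best = max(reported, key=lambda m: reported[m])
        out[dataset] = (best, round(PROPOSED[i] - reported[best], 2))
    return out


for dataset, (method, gain) in margins_over_best_prior().items():
    print(f"{dataset}: +{gain:.2f} points over {method}")
```

The SynDD1 margin works out to 98.77 − 97.13 = 1.64 percentage points, which matches the figure quoted in the abstract; the margins on YawDD and SF-DDDD are 0.50 and 0.37 points.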
