Vehicle target detection by fusing event data and image frames

Journal of Computer Applications, 2024, Vol. 44, Issue (3): 931-937. DOI: 10.11772/j.issn.1001-9081.2023040420

Special Issue: Multimedia computing and computer simulation

Yuliang ZHENG, Yunhua CHEN, Weijie BAI, Pinghua CHEN

School of Computer Science, Guangdong University of Technology, Guangzhou Guangdong 510006, China

Received: 2023-04-14; Revised: 2023-07-24; Accepted: 2023-07-26; Online: 2023-12-04; Published: 2024-03-10
Contact: Yunhua CHEN
About the authors: ZHENG Yuliang, born in 1998, from Guangzhou, Guangdong, M.S. candidate, CCF member. His research interests include event cameras, object detection, and image processing.
BAI Weijie, born in 1997, from Nanyang, Henan, M.S. candidate, CCF member. His research interests include event cameras and object classification.
CHEN Pinghua, born in 1969, from Zhuzhou, Hunan, Ph.D., professor. His research interests include cloud computing and recommendation systems.
Supported by: Natural Science Foundation of Guangdong Province (2021A1515012233)

Abstract:

Combining event cameras with conventional frame cameras for vehicle detection addresses two complementary weaknesses: conventional cameras suffer from overexposure, underexposure, and motion blur in high-dynamic-range scenes, while event cameras lack texture information, which lowers their detection accuracy. Existing fusion algorithms often suffer from high computational complexity, loss of feature information, and poor fusion quality. To address these problems, a vehicle detection algorithm that effectively fuses event-camera and conventional-camera data is proposed. First, a spatio-temporal event representation based on Event Frequency (EF) and Time Surface (TS) is proposed to encode event data into event frames. Then, a Feature fusion module based on Channel and Spatial Attention (FCSA) is proposed to fuse image frames and event frames at the feature level. Finally, the prior (anchor) boxes are optimized with a differential evolution search algorithm to further improve detection performance. In addition, because public datasets containing both image frames and event data are scarce, a vehicle detection dataset, MVSEC-CAR, was built. Experimental results show that, on the public PKU-DDD17-CAR dataset, the mean Average Precision (mAP) of the proposed algorithm is 2.6 percentage points higher than that of the second-best method, ADF (Attention fusion Detection Framework), and that it also achieves a higher frame rate. The algorithm improves both the accuracy of vehicle detection and its robustness to lighting conditions, which validates the proposed event representation, feature fusion, and prior box optimization.

Key words: event camera, vehicle target detection, attention mechanism, feature fusion, event representation
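
The feature-level fusion step named in the abstract can be illustrated with a minimal sketch. This is not the FCSA module itself: the abstract gives no layer details, so the gates below are parameter-free stand-ins (a real module would normally produce them with learned layers), and the function names `sigmoid` and `fuse_features` are hypothetical. It only shows the general pattern of reweighting concatenated image and event features first per channel and then per pixel.

```python
import numpy as np


def sigmoid(z):
    """Logistic function, used here as a soft gate in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def fuse_features(image_feat, event_feat):
    """Fuse two (C, H, W) feature maps with channel, then spatial, gating.

    Assumption: gates come from simple pooling statistics instead of
    learned weights, purely for illustration.
    """
    # Stack both modalities along the channel axis: (2C, H, W).
    stacked = np.concatenate([image_feat, event_feat], axis=0)

    # Channel attention: global average pooling gives one gate per channel.
    channel_gate = sigmoid(stacked.mean(axis=(1, 2), keepdims=True))
    reweighted = stacked * channel_gate

    # Spatial attention: mean and max across channels give one gate per pixel.
    pooled = reweighted.mean(axis=0, keepdims=True) + reweighted.max(axis=0, keepdims=True)
    spatial_gate = sigmoid(pooled)
    return reweighted * spatial_gate


image_feat = np.random.default_rng(0).random((8, 16, 16))
event_feat = np.random.default_rng(1).random((8, 16, 16))
fused = fuse_features(image_feat, event_feat)
print(fused.shape)
```

The output keeps all channels of both inputs, so a downstream detector head would see which modality contributed each channel, in contrast to a plain concatenation that applies no weighting.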

CLC Number: TP391.41

Yuliang ZHENG, Yunhua CHEN, Weijie BAI, Pinghua CHEN. Vehicle target detection by fusing event data and image frames[J]. Journal of Computer Applications, 2024, 44(3): 931-937.

References

1	MAO Q-C, SUN H-M, ZUO L-Q, et al. Finding every car: a traffic surveillance multi-scale vehicle object detection method [J]. Applied Intelligence, 2020, 50: 3125-3136. DOI: 10.1007/s10489-020-01704-5
2	XIAO J, WU Y, CHEN Y, et al. LSTFE-Net: Long short-term feature enhancement network for video small object detection [C]// Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2023: 14613-14622. DOI: 10.1109/cvpr52729.2023.01404
3	XIAO J, GUO H, ZHOU J, et al. Tiny object detection with context enhancement and feature purification [J]. Expert Systems with Applications, 2023, 211: 118665. DOI: 10.1016/j.eswa.2022.118665
4	LICHTSTEINER P, POSCH C, DELBRUCK T. A 128×128 120 dB 15 µs latency asynchronous temporal contrast vision sensor [J]. IEEE Journal of Solid-State Circuits, 2008, 43(2): 566-576. DOI: 10.1109/jssc.2007.914337
5	CHEN G, CAO H, CONRADT J, et al. Event-based neuromorphic vision for autonomous driving: a paradigm shift for bio-inspired visual sensing and perception [J]. IEEE Signal Processing Magazine, 2020, 37(4): 34-49. DOI: 10.1109/msp.2020.2985815
6	ZHANG J, DONG B, ZHANG H, et al. Spiking transformers for event-based single object tracking [C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 8791-8800. DOI: 10.1109/cvpr52688.2022.00860
7	CAI Z H, CHEN W J, ZHAO J, et al. Object detection and obstacle avoidance based on dynamic vision sensor for UAV [J/OL]. Journal of Beijing University of Aeronautics and Astronautics, 2022: 1-15 [2023-12-15]. (in Chinese)
8	JIANG Z, XIA P, HUANG K, et al. Mixed frame-/event-driven fast pedestrian detection [C]// Proceedings of the 2019 International Conference on Robotics and Automation. Piscataway: IEEE, 2019: 8332-8338. DOI: 10.1109/icra.2019.8793924
9	LI J, DONG S, YU Z, et al. Event-based vision enhanced: A joint detection framework in autonomous driving [C]// Proceedings of the 2019 IEEE International Conference on Multimedia and Expo. Piscataway: IEEE, 2019: 1396-1401. DOI: 10.1109/icme.2019.00242
10	CAO H, CHEN G, XIA J, et al. Fusion-based feature attention gate component for vehicle detection based on event camera [J]. IEEE Sensors Journal, 2021, 21(21): 24540-24548. DOI: 10.1109/jsen.2021.3115016
11	LIU M, QI N, SHI Y, et al. An attention fusion network for event-based vehicle object detection [C]// Proceedings of the 2021 IEEE International Conference on Image Processing. Piscataway: IEEE, 2021: 3363-3367. DOI: 10.1109/icip42928.2021.9506561
12	REDMON J, FARHADI A. YOLOv3: an incremental improvement [EB/OL]. (2018-04-08) [2023-04-13]. DOI: 10.1109/cvpr.2017.690
13	BENJDIRA B, KHURSHEED T, KOUBAA A, et al. Car detection using unmanned aerial vehicles: comparison between faster R-CNN and YOLOv3 [C]// Proceedings of the 2019 International Conference on Unmanned Vehicle Systems-Oman. Piscataway: IEEE, 2019: 1-6. DOI: 10.1109/uvs.2019.8658300
14	BOX G E P, TIAO G C. Bayesian Inference in Statistical Analysis [M]. New York: John Wiley & Sons, 2011: 149-316.
15	PAREDES-VALLÉS F, SCHEPER K Y W, DE CROON G C H E. Unsupervised learning of a hierarchical spiking neural network for optical flow estimation: from events to global motion perception [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(8): 2051-2064. DOI: 10.1109/tpami.2019.2903179
16	TAVANAEI A, GHODRATI M, KHERADPISHEH S R, et al. Deep learning in spiking neural networks [J]. Neural Networks, 2019, 111: 47-63. DOI: 10.1016/j.neunet.2018.12.002
17	LEE H, KWON H, ROBINSON R M, et al. Dynamic belief fusion for object detection [C]// Proceedings of the 2016 IEEE Winter Conference on Applications of Computer Vision. Piscataway: IEEE, 2016: 1-9. DOI: 10.1109/wacv.2016.7477574
18	LIN T-Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection [C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 2999-3007. DOI: 10.1109/iccv.2017.324
19	ANEESH A N, SHINE L, PRADEEP R, et al. Real-time traffic light detection and recognition based on deep RetinaNet for self driving cars [C]// Proceedings of the 2019 2nd International Conference on Intelligent Computing, Instrumentation and Control Technologies. Piscataway: IEEE, 2019: 1554-1557. DOI: 10.1109/icicict46008.2019.8993293
20	HU J, SHEN L, SUN G. Squeeze-and-excitation networks [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7132-7141. DOI: 10.1109/cvpr.2018.00745
21	ZHU A Z, THAKUR D, ÖZASLAN T, et al. The multivehicle stereo event camera dataset: an event camera dataset for 3D perception [J]. IEEE Robotics and Automation Letters, 2018, 3(3): 2032-2039. DOI: 10.1109/lra.2018.2800793
22	STORN R, PRICE K. Differential evolution: a simple and efficient heuristic for global optimization over continuous spaces [J]. Journal of Global Optimization, 1997, 11: 341-359. DOI: 10.1023/a:1008202821328
23	POSCH C, MATOLIN D, WOHLGENANNT R. An asynchronous time-based image sensor [C]// Proceedings of the 2008 IEEE International Symposium on Circuits and Systems. Piscataway: IEEE, 2008: 2130-2133. DOI: 10.1109/iscas.2008.4541871
24	BRANDLI C, BERNER R, YANG M H, et al. A 240×180 130 dB 3 μs latency global shutter spatiotemporal vision sensor [J]. IEEE Journal of Solid-State Circuits, 2014, 49(10): 2333-2341. DOI: 10.1109/jssc.2014.2342715
25	ANGOTZI G N, BOI F, LECOMTE A, et al. SiNAPS: an implantable active pixel sensor CMOS-probe for simultaneous large-scale neural recordings [J]. Biosensors and Bioelectronics, 2019, 126: 355-364. DOI: 10.1016/j.bios.2018.10.032
26	MOEYS D P, CORRADI F, LI C, et al. A sensitive dynamic and active pixel vision sensor for color or neural imaging applications [J]. IEEE Transactions on Biomedical Circuits and Systems, 2017, 12(1): 123-136. DOI: 10.1109/tbcas.2017.2759783
27	CHEN G, CAO H, YE C, et al. Multi-cue event information fusion for pedestrian detection with neuromorphic vision sensors [J]. Frontiers in Neurorobotics, 2019, 13: 10. DOI: 10.3389/fnbot.2019.00010
28	BALDWIN R W, ALMATRAFI M, KAUFMAN J R, et al. Inceptive event time-surfaces for object classification using neuromorphic cameras [C]// Proceedings of the 2019 International Conference on Image Analysis and Recognition. Cham: Springer, 2019: 395-403. DOI: 10.1007/978-3-030-27272-2_35
29	BINAS J, NEIL D, LIU S-C, et al. DDD17: end-to-end DAVIS driving dataset [EB/OL]. (2017-11-04) [2023-04-13].
30	WU W, LIU H, LI L, et al. Application of local fully convolutional neural network combined with YOLOv5 algorithm in small target detection of remote sensing image [J]. PLoS ONE, 2021, 16(10): e0259283. DOI: 10.1371/journal.pone.0259283

Input	mAP/% (PKU-DDD17-CAR)	mAP/% (MVSEC-CAR)	Frame rate/(frame·s^-1)
APS	88.6	69.5	14
Event	45.8	41.3	14
APS+Event	89.5	71.3	12

No.	Algorithm	PKU-DDD17-CAR mAP/%	PKU-DDD17-CAR frame rate/(frame·s^-1)	MVSEC-CAR mAP/%	MVSEC-CAR frame rate/(frame·s^-1)
1	None+Cat	78.4	12	66.1	12
2	Anchor_opt+Cat	78.6	12	66.6	12
3	None+FCSA	86.1	12	69.7	12
4	Anchor_opt+FCSA	89.5	12	71.3	12
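
The Anchor_opt rows correspond to the prior box optimization step, which the abstract attributes to a differential evolution search. The sketch below shows one way such a search could look, under assumptions that are not taken from the paper: the objective is the mean best width-height IoU between ground-truth boxes and anchors, the variant is the classic rand/1/bin scheme, and all names (`wh_iou`, `anchor_fitness`, `optimize_anchors`) and parameter values are hypothetical.

```python
import random


def wh_iou(a, b):
    """IoU of two boxes given as (width, height), assuming shared centers."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def anchor_fitness(flat, boxes):
    """Mean over boxes of the IoU with the best-matching anchor."""
    anchors = [(flat[i], flat[i + 1]) for i in range(0, len(flat), 2)]
    return sum(max(wh_iou(box, a) for a in anchors) for box in boxes) / len(boxes)


def optimize_anchors(boxes, k=2, pop_size=20, generations=100,
                     weight=0.5, cr=0.9, seed=0):
    """Search k anchor sizes with differential evolution (rand/1/bin)."""
    rng = random.Random(seed)
    lo = min(min(b) for b in boxes)   # search bounds taken from the data
    hi = max(max(b) for b in boxes)
    dim = 2 * k
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    scores = [anchor_fitness(ind, boxes) for ind in pop]

    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the current one.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = []
            for d in range(dim):
                if rng.random() < cr or d == forced:
                    v = pop[a][d] + weight * (pop[b][d] - pop[c][d])
                    v = min(max(v, lo), hi)  # clamp to the search bounds
                else:
                    v = pop[i][d]
                trial.append(v)
            trial_score = anchor_fitness(trial, boxes)
            # Greedy selection: keep the trial only if it is not worse.
            if trial_score >= scores[i]:
                pop[i], scores[i] = trial, trial_score

    best = max(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


boxes = [(10.0, 10.0), (12.0, 11.0), (40.0, 20.0), (42.0, 22.0)]
anchors, score = optimize_anchors(boxes, k=2, generations=30, seed=1)
print(anchors, score)
```

Because selection is greedy, the best score in the population never decreases from one generation to the next, which is the property that makes this kind of search a reasonable refinement step on top of hand-chosen or clustered anchors.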

Algorithm	Framework	mAP/% (day)	mAP/% (night)	mAP/% (all)	Frame rate/(frame·s^-1)
JDF [9]	Faster-RCNN	90.8	83.3	86.6	3
JDF [9]	SSD	-	-	75.9	12
JDF [9]	YOLOv2	-	-	77.8	15
JDF [9]	YOLOv3	-	-	84.1	9
FAGC [10]	RetinaNet	80.5	86.2	81.6	8
ADF [11]	Gaussian-YOLOv3	-	-	86.9	-
Proposed algorithm	RetinaNet	89.8	89.4	89.5	12
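
The 2.6 percentage point gain quoted in the abstract can be checked against the overall mAP column of this table. The short calculation below uses only the table values; the relative figure is added for contrast and is not a number reported by the authors.

```latex
\Delta_{\mathrm{mAP}} = 89.5\% - 86.9\% = 2.6\ \text{percentage points},
\qquad
\frac{89.5 - 86.9}{86.9} \approx 3.0\%\ \text{(relative improvement)}
```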

Event representation	Input	PKU-DDD17-CAR mAP/%	MVSEC-CAR mAP/%
TS	Event	44.7	35.3
EF	Event	44.8	37.9
Proposed algorithm	Event	45.8	41.3
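
The table compares Time Surface (TS) and Event Frequency (EF) encodings with the proposed combination. As a rough illustration of what the two base encodings capture, the sketch below builds one channel of each from a list of events. The formulas are common textbook forms chosen as assumptions (count normalized by the maximum count for EF, exponential decay of the latest timestamp for TS); the paper's exact definitions and the way it merges the two are not given in the abstract, and `encode_event_frame` and `tau` are hypothetical names.

```python
import math


def encode_event_frame(events, width, height, tau=0.05):
    """Encode (x, y, t, polarity) events into EF and TS channels.

    Returns (ef, ts), each a height x width nested list with values in [0, 1].
    Assumption: polarity is ignored to keep the sketch short.
    """
    counts = [[0] * width for _ in range(height)]
    last_t = [[None] * width for _ in range(height)]
    t_ref = max(t for _, _, t, _ in events) if events else 0.0

    for x, y, t, _polarity in events:
        counts[y][x] += 1                       # how often the pixel fired
        if last_t[y][x] is None or t > last_t[y][x]:
            last_t[y][x] = t                    # when the pixel last fired

    # Event frequency: per-pixel count scaled by the largest count.
    max_count = max(max(row) for row in counts) or 1
    ef = [[c / max_count for c in row] for row in counts]

    # Time surface: recent events are close to 1, older ones decay toward 0.
    ts = [[0.0 if t is None else math.exp(-(t_ref - t) / tau) for t in row]
          for row in last_t]
    return ef, ts


events = [(0, 0, 0.00, 1), (0, 0, 0.05, -1), (1, 0, 0.10, 1)]
ef, ts = encode_event_frame(events, width=2, height=1, tau=0.05)
print(ef, ts)
```

EF records how much activity a pixel saw and TS records how recent it was, so stacking the two channels gives a frame with both spatial density and temporal ordering, which is the intuition behind a combined spatio-temporal representation.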
