Journal of Computer Applications, 2024, Vol. 44, Issue (3): 931-937. DOI: 10.11772/j.issn.1001-9081.2023040420

• Multimedia computing and computer simulation •

Vehicle target detection by fusing event data and image frames

Yuliang ZHENG, Yunhua CHEN, Weijie BAI, Pinghua CHEN

  1. School of Computer Science, Guangdong University of Technology, Guangzhou, Guangdong 510006, China
  • Received: 2023-04-14 Revised: 2023-07-24 Accepted: 2023-07-26 Online: 2023-12-04 Published: 2024-03-10
  • Contact: Yunhua CHEN
  • About authors: ZHENG Yuliang, born in 1998, M. S. candidate. His research interests include event cameras, object detection, and image processing.
    BAI Weijie, born in 1997, M. S. candidate. His research interests include event cameras and object classification.
    CHEN Pinghua, born in 1969, Ph. D., professor. His research interests include cloud computing and recommendation systems.
  • Supported by:
    Natural Science Foundation of Guangdong Province (2021A1515012233)


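ZHENG Yuliang (郑宇亮), CHEN Yunhua (陈云华), BAI Weijie (白伟杰), CHEN Pinghua (陈平华)

  1. School of Computer Science, Guangdong University of Technology, Guangzhou 510006
  • Corresponding author: CHEN Yunhua (陈云华)
  • About the author: ZHENG Yuliang (born 1998), male, from Guangzhou, Guangdong, M. S. candidate, CCF member. His main research interests are event cameras, object detection, and image processing.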


Combining event cameras with conventional frame cameras for vehicle target detection addresses two complementary weaknesses: conventional cameras suffer from over-exposure, under-exposure, and motion blur in high-dynamic-range scenes, while event cameras yield low detection accuracy because they lack texture information. Existing fusion algorithms, however, often suffer from high computational complexity, loss of feature information, and poor fusion quality. To address these problems, a vehicle target detection algorithm that effectively fuses event cameras and conventional cameras is proposed. First, a spatio-temporal event representation based on Event Frequency (EF) and Time Surface (TS) is proposed to encode event data into event frames. Then, a Feature fusion module based on Channel and Spatial Attention (FCSA) is proposed to perform feature-level fusion of image frames and event frames. Finally, the prior boxes are optimized with a differential evolution search algorithm to further improve vehicle detection performance. In addition, because public datasets containing both image frames and event data are scarce, a vehicle detection dataset, MVSEC-CAR, was established. Experimental results show that, on the public PKU-DDD17-CAR dataset, the mean Average Precision (mAP) of the proposed algorithm is 2.6 percentage points higher than that of the second-best method, ADF (Attention fusion Detection Framework), and that it achieves a higher frame rate. The algorithm thus improves both the accuracy of vehicle target detection and its robustness to lighting, which validates the effectiveness of the proposed event representation, feature fusion, and prior box optimization.
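To make the event-to-frame encoding step more concrete, the following is a minimal sketch of how an event stream might be turned into an event-frequency channel and a time-surface channel. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: the function name `encode_events`, the `(x, y, t, polarity)` event layout, the sigmoid-style normalization of event counts, and the exponential decay constant `tau` are hypothetical and are not taken from the paper.

```python
import math


def encode_events(events, width, height, t_ref, tau=0.05):
    """Encode (x, y, t, polarity) events into two per-pixel channels.

    Returns (ef, ts):
      ef: event-count map squashed into [0, 1) (assumed normalization)
      ts: time surface, exp(-(t_ref - t_last) / tau), 0 where no event fired
    """
    counts = [[0] * width for _ in range(height)]
    last_t = [[None] * width for _ in range(height)]

    for x, y, t, _polarity in events:
        if t > t_ref:
            continue  # skip events newer than the reference time
        counts[y][x] += 1
        if last_t[y][x] is None or t > last_t[y][x]:
            last_t[y][x] = t  # remember the most recent event per pixel

    # Squash raw counts so busy pixels saturate instead of dominating.
    ef = [[2.0 / (1.0 + math.exp(-n)) - 1.0 for n in row] for row in counts]
    # Recent events map close to 1, old events decay toward 0.
    ts = [
        [0.0 if t is None else math.exp(-(t_ref - t) / tau) for t in row]
        for row in last_t
    ]
    return ef, ts
```

Calling `encode_events(events, width, height, t_ref)` on one time window yields two maps of shape `height x width` that could be stacked as channels of an event frame. The count channel captures how active a pixel was, while the time surface captures how recently it was active, which is the kind of spatio-temporal information the abstract attributes to combining EF and TS.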

Key words: event camera, vehicle target detection, attention mechanism, feature fusion, event representation
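The prior box optimization mentioned in the abstract can be illustrated with a small differential evolution search over anchor widths and heights. This is only a sketch under stated assumptions: the abstract does not describe how candidates are encoded, what fitness is used, or which hyperparameters apply, so the DE/rand/1/bin variant, the mean best-IoU fitness, and the names `wh_iou`, `fitness`, and `evolve_anchors` are all hypothetical choices.

```python
import random


def wh_iou(a, b):
    """IoU of two boxes (w, h) that share the same center."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def fitness(vec, boxes):
    """Mean over ground-truth boxes of the IoU with their best anchor."""
    anchors = [(vec[i], vec[i + 1]) for i in range(0, len(vec), 2)]
    return sum(max(wh_iou(b, a) for a in anchors) for b in boxes) / len(boxes)


def evolve_anchors(boxes, k=2, pop_size=20, generations=100,
                   f=0.5, cr=0.9, seed=0):
    """Search for k anchor (w, h) pairs with DE/rand/1/bin."""
    rng = random.Random(seed)
    lo = min(min(b) for b in boxes)  # assumes strictly positive sizes
    hi = max(max(b) for b in boxes)
    dim = 2 * k
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    scores = [fitness(p, boxes) for p in pop]

    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the target.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # force at least one mutated gene
            trial = []
            for d in range(dim):
                if rng.random() < cr or d == j_rand:
                    v = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    v = min(max(v, lo), hi)  # keep sizes inside data range
                else:
                    v = pop[i][d]
                trial.append(v)
            s = fitness(trial, boxes)
            if s >= scores[i]:  # greedy selection: keep the better vector
                pop[i], scores[i] = trial, s

    best = max(range(pop_size), key=scores.__getitem__)
    anchors = [(pop[best][d], pop[best][d + 1]) for d in range(0, dim, 2)]
    return sorted(anchors), scores[best]
```

Given a list of ground-truth `(width, height)` pairs, `evolve_anchors(boxes, k=2)` returns the `k` anchor shapes with the highest fitness found and that fitness value. Because selection is greedy, the best score in the population never decreases across generations; the fixed `seed` only makes the sketch reproducible.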


