Journal of Computer Applications (《计算机应用》) ›› 2026, Vol. 46 ›› Issue (4): 1283-1291. DOI: 10.11772/j.issn.1001-9081.2025040472

• Multimedia Computing and Computer Simulation •

面向复杂交通场景的多尺度实时人车检测方法CDC-DETR

严心怡1, 朱灵龙2,3,4, 张永宏1,2,4

    1.南京信息工程大学 自动化学院,南京 210044
    2.无锡学院 物联网工程学院,江苏 无锡 214105
    3.公安部交通管理科学研究所,江苏 无锡 214151
    4.无锡学院 无锡市车联网重点实验室,江苏 无锡 214105
  • 收稿日期:2025-05-06 修回日期:2025-07-21 接受日期:2025-07-23 发布日期:2026-04-21 出版日期:2026-04-10
  • 通讯作者: 张永宏
  • 作者简介:严心怡(2002—),女,江苏盐城人,硕士研究生,主要研究方向:计算机视觉、深度学习
    朱灵龙(1993—),男,江苏南通人,副教授,博士,主要研究方向:交通大数据
  • 基金资助:
    国家自然科学基金资助项目(42175157);国家自然科学基金资助项目(42305158);江苏省无锡市“太湖之光”科技攻关计划(基础研究)项目(K20231021)

CDC-DETR: multi-scale real-time human-vehicle detection method for complex traffic scenarios

Xinyi YAN1, Linglong ZHU2,3,4, Yonghong ZHANG1,2,4

    1.School of Automation,Nanjing University of Information Science and Technology,Nanjing Jiangsu 210044,China
    2.School of Internet of Things Engineering,Wuxi University,Wuxi Jiangsu 214105,China
    3.Traffic Management Research Institute of the Ministry of Public Security,Wuxi Jiangsu 214151,China
    4.Wuxi Key Laboratory of Telematics,Wuxi University,Wuxi Jiangsu 214105,China
  • Received:2025-05-06 Revised:2025-07-21 Accepted:2025-07-23 Online:2026-04-21 Published:2026-04-10
  • Contact: Yonghong ZHANG
  • About author:YAN Xinyi, born in 2002, M. S. candidate. Her research interests include computer vision, deep learning.
    ZHU Linglong, born in 1993, Ph. D., associate professor. His research interests include big data for transportation.
  • Supported by:
    National Natural Science Foundation of China(42175157, 42305158);Program of “Light of Taihu Lake” Science and Technology (Basic Research) of Wuxi City, Jiangsu Province(K20231021)

摘要:

交通场景的复杂性和多变性对现有的人车目标检测算法提出了挑战,尤其在处理遮挡、光照变化和多尺度目标时,现有算法通常精度不足且计算效率较低。为解决上述问题,在RT-DETR(Real-Time DEtection TRansformer)模型的基础上,提出一种改进型检测模型CDC-DETR(CPPA-DWRC-CGNET-DETR)。首先,设计上下文预激活池化注意力(CPPA)模块,以增强远距离依赖,优化特征提取;其次,引入膨胀残差连接(DWRC)模块,提升多尺度特征表达能力;再次,提出轻量化的上下文引导模块(CG Block),融合局部、周边和全局信息,降低计算成本;最后,结合上述模块,构建一个适用于复杂交通场景的高精度、高效率的实时人车检测模型。实验结果表明,与RT-DETR相比,在数据集BDD100K上,当交并比(IoU)为0.5时,CDC-DETR的检测平均精度均值(mAP)提高了6.12%,召回率提升了4.35%,而浮点运算量减少了11.23%,显著提高了计算效率,为边缘设备的部署提供了高效的解决方案。

关键词: 辅助驾驶, 人车检测, Transformer, 智能交通感知, 多尺度特征融合

Abstract:

The complexity and variability of traffic scenarios challenge existing human-vehicle detection algorithms: when dealing with occlusion, illumination changes, and multi-scale targets, they tend to have insufficient accuracy and low computational efficiency. To address these problems, an improved detection model, CDC-DETR (CPPA-DWRC-CGNET-DETR), was developed based on the RT-DETR (Real-Time DEtection TRansformer) architecture. Firstly, a Context Pre-activation Pooling Attention (CPPA) module was designed to enhance long-range dependencies and optimize feature extraction. Secondly, a Dilation-Wise Residual Connection (DWRC) module was introduced to improve multi-scale feature representation. Thirdly, a lightweight Context Guided Block (CG Block) was proposed to fuse local, surrounding, and global information while reducing computational cost. Finally, these modules were integrated to construct a high-accuracy, efficient real-time human-vehicle detection model suitable for complex traffic scenarios. Experimental results on the BDD100K dataset show that, compared with RT-DETR at an Intersection over Union (IoU) threshold of 0.5, CDC-DETR improves the mean Average Precision (mAP) by 6.12%, increases the recall by 4.35%, and decreases the number of floating-point operations by 11.23%, significantly enhancing computational efficiency and providing an efficient solution for deployment on edge devices.

Key words: assisted driving, human-vehicle detection, Transformer, intelligent traffic perception, multi-scale feature fusion
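
The results in the abstract are reported at an IoU threshold of 0.5. As a rough illustration of what that threshold means, and not the authors' evaluation code, the sketch below computes IoU for two axis-aligned boxes and a simple greedy recall. The `(x1, y1, x2, y2)` box format, the function names `iou` and `recall_at_iou`, and the greedy one-to-one matching are all assumptions made for this example.

```python
from typing import List, Sequence

# Hypothetical box format for this sketch: (x1, y1, x2, y2) corner coordinates.
Box = Sequence[float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_at_iou(preds: List[Box], gts: List[Box], thr: float = 0.5) -> float:
    """Fraction of ground-truth boxes matched by some prediction with IoU >= thr.

    Assumes `preds` is already sorted by descending confidence; each
    ground-truth box can be matched at most once (greedy matching).
    """
    matched = set()
    for p in preds:
        best_j, best_v = -1, thr
        for j, g in enumerate(gts):
            if j in matched:
                continue
            v = iou(p, g)
            if v >= best_v:
                best_j, best_v = j, v
        if best_j >= 0:
            matched.add(best_j)
    return len(matched) / len(gts) if gts else 0.0
```

With a threshold of 0.5, a prediction counts as a hit only if it overlaps a ground-truth box by at least half of their combined area, so a looser or tighter threshold would change both the mAP and recall figures.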

CLC Number: