Journal of Computer Applications, 2026, Vol. 46, Issue (4): 1283-1291. DOI: 10.11772/j.issn.1001-9081.2025040472
• Multimedia computing and computer simulation •
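Xinyi YAN, Linglong ZHU, Yonghong ZHANG
Received: 2025-05-06
Revised: 2025-07-21
Accepted: 2025-07-23
Online: 2026-04-21
Published: 2026-04-10
Corresponding author: Yonghong ZHANG
About the author: YAN Xinyi, born in 2002, female, from Yancheng, Jiangsu, M. S. candidate. Her research interests include computer vision and deep learning.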
Abstract:
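The complexity and variability of traffic scenes challenge existing human-vehicle detection algorithms: when handling occlusion, illumination changes, and multi-scale targets, existing algorithms typically lack accuracy and are computationally inefficient. To address these problems, an improved detection model, CDC-DETR (CPPA-DWRC-CGNET-DETR), is proposed on the basis of the RT-DETR (Real-Time DEtection TRansformer) model. First, a context pre-activation pooling attention (CPPA) module is designed to strengthen long-range dependencies and optimize feature extraction. Second, a dilated residual connection (DWRC) module is introduced to improve multi-scale feature representation. Third, a lightweight context guided module (CG Block) is proposed, which fuses local, surrounding, and global information and lowers the computational cost. Finally, these modules are combined into a high-accuracy, high-efficiency real-time human-vehicle detection model suited to complex traffic scenes. Experimental results show that, compared with RT-DETR on the BDD100K dataset at an Intersection over Union (IoU) threshold of 0.5, CDC-DETR improves the mean Average Precision (mAP) by 6.12% and the recall by 4.35%, while reducing the number of floating-point operations by 11.23%. This significantly improves computational efficiency and provides an efficient solution for deployment on edge devices.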
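The IoU threshold of 0.5 mentioned in the abstract decides when a predicted box counts as a match for a ground-truth box. The following is a minimal sketch of that check, assuming axis-aligned boxes given as (x1, y1, x2, y2) corner coordinates; the function name `box_iou` and the sample boxes are illustrative and do not come from the paper.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical prediction and ground truth, for illustration only.
pred = (10, 10, 50, 50)
truth = (20, 10, 60, 50)

iou = box_iou(pred, truth)
is_match = iou >= 0.5  # the threshold used for mAP0.5
print(f"IoU = {iou:.2f}, match at 0.5: {is_match}")
```

Metrics such as mAP0.5-0.95 repeat the same matching over a range of thresholds, which is why they are stricter than mAP0.5.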
Xinyi YAN, Linglong ZHU, Yonghong ZHANG. CDC-DETR: multi-scale real-time human-vehicle detection method for complex traffic scenarios[J]. Journal of Computer Applications, 2026, 46(4): 1283-1291.
| Category | Precision | Recall | mAP0.5 | mAP0.5-0.95 | F1 score | Accuracy |
|---|---|---|---|---|---|---|
| Overall | 0.693 | 0.552 | 0.590 | 0.365 | 0.615 | 0.537 |
| Pedestrian | 0.702 | 0.491 | 0.579 | 0.256 | 0.579 | 0.690 |
| Car | 0.806 | 0.721 | 0.789 | 0.474 | 0.761 | 0.840 |
| Bus | 0.640 | 0.474 | 0.490 | 0.376 | 0.545 | 0.550 |
| Truck | 0.625 | 0.524 | 0.504 | 0.354 | 0.571 | 0.580 |
Tab. 1 Evaluation results of CDC-DETR on the test set (300 epochs)
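The F1 score column in Tab. 1 can be cross-checked against the precision and recall columns. The sketch below assumes the standard definition of F1 as the harmonic mean of precision and recall (the page itself does not state the formula); the dictionary layout and the name `f1_score` are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from Tab. 1.
table1 = {
    "Overall": (0.693, 0.552, 0.615),
    "Pedestrian": (0.702, 0.491, 0.579),
    "Car": (0.806, 0.721, 0.761),
    "Bus": (0.640, 0.474, 0.545),
    "Truck": (0.625, 0.524, 0.571),
}

# Absolute gap between the recomputed and the reported F1 per category.
gaps = {}
for name, (p, r, reported) in table1.items():
    computed = f1_score(p, r)
    gaps[name] = abs(computed - reported)
    print(f"{name}: computed {computed:.3f}, reported {reported:.3f}")
```

Because the inputs are already rounded to three decimals, the recomputed value can differ from the reported one in the last digit.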
| Model | Precision | Recall | mAP0.5 | mAP0.5-0.95 | F1 score | Params/10^6 | GFLOPs | Accuracy |
|---|---|---|---|---|---|---|---|---|
| SSD | 0.469 | 0.233 | 0.265 | 0.125 | 0.311 | 2 628 | 62.7 | 0.200 |
| Faster R-CNN | 0.338 | 0.240 | 0.178 | 0.107 | 0.281 | 2 848 | 188.2 | 0.137 |
| Mask R-CNN | 0.585 | 0.504 | 0.548 | 0.316 | 0.542 | 4 143 | 90.9 | 0.408 |
| RetinaNet | 0.633 | 0.450 | 0.507 | 0.299 | 0.526 | 3 668 | 84.5 | 0.377 |
| RTMDet | 0.454 | 0.400 | 0.348 | 0.196 | 0.425 | 2 760 | 54.1 | 0.261 |
| YOLO11 | 0.648 | 0.553 | 0.596 | 0.383 | 0.598 | 2 003 | 67.7 | 0.434 |
| YOLOv10 | 0.699 | 0.525 | 0.590 | 0.378 | 0.600 | 1 645 | 63.4 | 0.424 |
| YOLOv8 | 0.724 | 0.539 | 0.599 | 0.384 | 0.616 | 2 584 | 78.7 | 0.443 |
| YOLOv5 | 0.676 | 0.535 | 0.584 | 0.369 | 0.598 | 2 504 | 64.0 | 0.435 |
| YOLOv3 | 0.690 | 0.547 | 0.605 | 0.387 | 0.609 | 10 366 | 282.2 | 0.456 |
| RT-DETR | 0.682 | 0.529 | 0.556 | 0.338 | 0.597 | 1 987 | 57.0 | 0.481 |
| CDC-DETR | 0.693 | 0.552 | 0.590 | 0.365 | 0.615 | 2 364 | 50.6 | 0.537 |
Tab. 2 Comparison of experimental results for different models
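The percentages quoted in the abstract can be reproduced from the RT-DETR and CDC-DETR rows of Tab. 2 if they are read as relative changes with respect to RT-DETR. That reading is an assumption that the arithmetic below supports; the helper `relative_change` and the dictionary keys are illustrative names.

```python
def relative_change(new, old):
    """Change of `new` relative to `old`, in percent."""
    return (new - old) / old * 100.0


# Values copied from the RT-DETR and CDC-DETR rows of Tab. 2.
rt_detr = {"mAP0.5": 0.556, "recall": 0.529, "GFLOPs": 57.0}
cdc_detr = {"mAP0.5": 0.590, "recall": 0.552, "GFLOPs": 50.6}

changes = {key: relative_change(cdc_detr[key], rt_detr[key]) for key in rt_detr}
for key, value in changes.items():
    print(f"{key}: {value:+.2f}%")
```

With these inputs the changes round to +6.12% for mAP0.5, +4.35% for recall, and -11.23% for GFLOPs, which matches the figures in the abstract. In absolute terms, mAP0.5 rises by 0.034 and recall by 0.023.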
| No. | CPPA | DWRC | CG Block | Params/10^6 | GFLOPs | Precision | Recall | mAP0.5 | mAP0.5-0.95 | Accuracy |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 |  |  |  | 1 987 | 57.0 | 0.682 | 0.529 | 0.556 | 0.338 | 0.481 |
| 2 | √ |  |  | 2 041 | 57.4 | 0.715 | 0.533 | 0.571 | 0.348 | 0.492 |
| 3 |  | √ |  | 2 453 | 60.7 | 0.669 | 0.534 | 0.570 | 0.351 | 0.483 |
| 4 |  |  | √ | 1 652 | 47.6 | 0.669 | 0.534 | 0.560 | 0.343 | 0.511 |
| 5 | √ | √ |  | 2 507 | 61.1 | 0.702 | 0.532 | 0.571 | 0.351 | 0.508 |
| 6 | √ |  | √ | 1 460 | 43.3 | 0.666 | 0.509 | 0.537 | 0.318 | 0.490 |
| 7 |  | √ | √ | 2 555 | 54.9 | 0.679 | 0.552 | 0.586 | 0.358 | 0.528 |
| 8 | √ | √ | √ | 2 364 | 50.6 | 0.693 | 0.552 | 0.590 | 0.365 | 0.537 |
Tab. 3 Results of ablation experiments