The complexity and variability of traffic scenarios challenge existing human-vehicle target detection algorithms. Under occlusion, illumination changes, and multi-scale targets in particular, these algorithms tend to suffer from insufficient accuracy and low computational efficiency. To address these problems, an improved detection model, CDC-DETR (CPPA-DWRC-CGNET-DETR), was developed based on the RT-DETR (Real-Time DEtection TRansformer) architecture. First, a Context Pre-activation Pooling Attention (CPPA) module was designed to enhance long-range dependencies and optimize feature extraction. Second, a Dilation-Wise Residual Connection (DWRC) module was introduced to improve multi-scale feature representation. Third, a lightweight Context Guided Block (CG Block) was proposed to fuse local, surrounding, and global information and reduce computational cost. Finally, these modules were integrated to construct a high-accuracy, efficient real-time human-vehicle detection model suitable for complex traffic scenarios. Experimental results on the BDD100K dataset show that, compared with RT-DETR at an Intersection over Union (IoU) threshold of 0.5, CDC-DETR improves the mean Average Precision (mAP) by 6.12%, increases recall by 4.35%, and decreases the number of floating-point operations by 11.23%, significantly enhancing computational efficiency and providing an effective solution for deployment on edge devices.
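For readers unfamiliar with the IoU criterion behind the reported mAP and recall figures, the following is a minimal sketch of how a predicted box is matched against a ground-truth box at a threshold of 0.5. The `(x1, y1, x2, y2)` corner format, the function names `iou` and `is_true_positive`, and the use of `>=` at the threshold are assumptions made for illustration; they are not taken from the CDC-DETR implementation or the BDD100K evaluation code.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Boxes are assumed to be (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union = sum of both areas minus the doubly counted overlap.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """Assumed convention: a prediction matches when IoU >= threshold."""
    return iou(pred_box, gt_box) >= threshold
```

Under this sketch, a prediction shifted by 2 units against a 10 x 10 ground-truth box (overlap 80, union 120) counts as a match at the 0.5 threshold, whereas one shifted by 5 units (overlap 50, union 150) does not.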