With the advancement of autonomous driving technology, real-time vehicle detection has become crucial for ensuring system safety and reliability. Therefore, a lightweight detection model based on YOLOv10, named YOLOv10-LITE, was designed for real-time detection tasks in resource-constrained environments. Four structural improvement modules were introduced to reduce model complexity and inference latency while maintaining detection accuracy. Specifically, the Dynamic Upsampling (DySample) module was applied to enhance the resolution of feature maps while reducing computational cost; the Fast Multi-Scale Network (FastMSNet) module was used to improve multi-scale feature extraction and thereby detection performance for objects of different sizes; the Spatial Pyramid Pooling-Local Selective Kernel Attention (SPPF_LSKA) module was introduced to capture long-range dependencies by combining local feature selection with global contextual modeling; and the Adaptive Granular Fine-grained Channel Attention (AGFCA) module was incorporated to improve the perception of critical information through collaboration between spatial and channel attention. Experimental results on the KITTI dataset show that YOLOv10-LITE achieves a mean Average Precision (mAP) of 77.1%, which is 2.4 percentage points higher than that of YOLOv10, with the parameter count reduced by 8.7%. These results verify the practicality of the proposed model in autonomous driving scenarios with both computational constraints and real-time demands.