To address the low accuracy and efficiency of existing object detection methods when applied to surface defect detection of strip steel, a GS-YOLO (Gather-and-distribute-Squeeze-YOLO) model is proposed. First, an SE (Squeeze-and-Excitation) attention mechanism is incorporated into the backbone network to strengthen the model's ability to recognize and locate defect features. Second, the standard convolutions in the original C3 module are replaced with Ghost convolutions, reducing the model's parameter count and computational cost. Finally, a GD (Gather-and-Distribute) feature fusion module is introduced in the neck, replacing the conventional Feature Pyramid Network (FPN) and Path Aggregation Network (PAN) structure, to ensure the continuity of feature fusion and improve recognition accuracy for objects of different scales. Experimental results show that, compared with the original YOLOv5s, the proposed model improves precision, recall and mAP@0.5 by 1.32, 5.18 and 2.56 percentage points respectively, while reducing the computational cost by 0.4 GFLOPs. These results indicate that the proposed method improves detection accuracy and lowers computational cost at the same time.
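The saving from replacing a standard convolution with a Ghost convolution can be illustrated with a short weight count. The sketch below is not the authors' implementation. It assumes the usual Ghost formulation, in which a primary convolution produces a fraction of the output channels and cheap depthwise filters generate the rest. The function names, the ratio of 2, the 5 x 5 cheap kernel, and the 128-channel layer are illustrative assumptions, not values taken from the paper.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a plain k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_conv_params(c_in, c_out, k, ratio=2, cheap_k=5):
    """Weights in a Ghost convolution (bias ignored).

    A primary k x k convolution produces c_out / ratio intrinsic maps.
    The remaining maps come from depthwise cheap_k x cheap_k filters,
    (ratio - 1) per intrinsic map. ratio and cheap_k are assumed values.
    """
    if c_out % ratio != 0:
        raise ValueError("c_out must be divisible by ratio")
    intrinsic = c_out // ratio
    primary = c_in * intrinsic * k * k
    cheap = intrinsic * (ratio - 1) * cheap_k * cheap_k
    return primary + cheap


if __name__ == "__main__":
    # Hypothetical layer: 128 -> 128 channels with a 3 x 3 kernel.
    std = standard_conv_params(128, 128, 3)
    ghost = ghost_conv_params(128, 128, 3)
    print(f"standard: {std} weights, ghost: {ghost} weights")
```

For this hypothetical layer the weight count drops from 147456 to 75328, close to the factor of 2 expected for a ratio of 2. The actual saving in GS-YOLO depends on which convolutions inside C3 are replaced and on the layer widths, which the abstract does not specify.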