Object detection has been widely applied in computer vision. However, most existing methods rely heavily on large-scale labeled data, which makes it difficult to handle novel categories with limited samples under real-world conditions. Although current Open-Vocabulary object Detection (OVD) methods exhibit some cross-category generalization ability, they commonly suffer from coarse semantic matching and insufficient spatial localization accuracy when facing structurally similar novel categories. To overcome these issues, a few-shot object detection algorithm based on YOLO-World was proposed. Firstly, a Category-aware Convolution Kernel Construction Module (CCKCM) was proposed to fuse textual semantic embeddings with visual features, thereby enhancing semantic perception of novel categories under the few-shot setting. Secondly, an efficient object matching and localization mechanism was introduced that combines sliding convolution with spatial geometric constraints, achieving fast matching and accurate localization of target regions while keeping computational complexity low. Finally, an image dataset for Few-Shot Object Detection (FSOD) tasks was built, covering multiple classic scenes and object categories. Experimental results show that on the PASCAL VOC 2007+2012 dataset, the proposed algorithm reaches a 10-shot average precision of 73.4% on novel classes, 1.4 percentage points higher than that of FM-FSOD. Thus, the proposed algorithm offers a feasible technical path for rapid recognition of novel-category objects in real-world scenarios.
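The two mechanisms summarized above — constructing a category-aware convolution kernel from a text embedding, then sliding it over a visual feature map to match and localize a target region — can be sketched roughly as follows. This is an illustrative sketch only: the function names, tensor shapes, and the broadcast-based kernel construction are assumptions for demonstration, not the paper's actual CCKCM implementation.

```python
import numpy as np

def build_category_kernel(text_emb, k=3):
    """Illustrative stand-in for CCKCM: turn a category text embedding
    of shape (C,) into a (C, k, k) conv kernel by broadcasting it over
    a k x k spatial window, then L2-normalizing (an assumption, not the
    paper's construction)."""
    c = text_emb.shape[0]
    kernel = np.tile(text_emb.reshape(c, 1, 1), (1, k, k))
    return kernel / (np.linalg.norm(kernel) + 1e-8)

def slide_match(feature_map, kernel):
    """Slide the kernel over a (C, H, W) feature map (valid padding,
    stride 1) and return the response map plus the (row, col) of the
    strongest response — a naive sliding-convolution matcher."""
    c, h, w = feature_map.shape
    _, k, _ = kernel.shape
    out_h, out_w = h - k + 1, w - k + 1
    resp = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = feature_map[:, i:i + k, j:j + k]
            resp[i, j] = np.sum(patch * kernel)
    best = np.unravel_index(np.argmax(resp), resp.shape)
    return resp, best

# Toy demonstration: plant a 3x3 region whose channels align with the
# category embedding, then recover its location by sliding the kernel.
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 16, 16)) * 0.1   # background noise
emb = rng.standard_normal(8)                    # category text embedding
feat[:, 5:8, 10:13] += emb.reshape(8, 1, 1)     # planted target region
resp, best = slide_match(feat, build_category_kernel(emb))
print(best)  # recovers the planted location (5, 10)
```

In a real detector the response map would feed the subsequent spatial geometric constraints for box refinement; here the arg-max location merely illustrates the matching step.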