Vehicle recognition faces two challenges: manually specified part combinations ignore the information carried by other vehicle parts, and inconsistent outputs among networks reduce precision in multi-network fusion. To address them, an adaptive collaborative optimization algorithm based on the Foundation for Intelligent Physical Agents (FIPA) model was proposed for vehicle recognition. Firstly, in the feature extraction process, SPace-to-Depth Convolution (SPD-Conv) was used to replace the standard convolutional layers in YOLOv8, reducing the loss of fine-grained information and the resulting ineffective detection. Secondly, an adaptive collaborative optimization network was designed to mine the vehicle parts that carry effective information and to resolve disordered competition among agents. Finally, a weighted log-polar voting mechanism based on the FIPA model was introduced to integrate short-distance and long-distance fine-grained information, countering the precision decline caused by inconsistent agent outputs during fusion. Experimental results on the DeepCar5.0 dataset show that, compared with YOLOv5, the proposed algorithm improves the mean Average Precision (mAP) at an Intersection over Union (IoU) threshold of 0.5 by 1.80 percentage points in the object detection stage. In the classification fusion stage, the proposed algorithm improves classification accuracy by 6.48 percentage points, and adding the preprocessing block improves classification accuracy by a further 7.53 percentage points.
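To make the SPD-Conv step concrete, the following is a minimal NumPy sketch of the general space-to-depth idea, not the implementation used in this work. It assumes a single feature map in (C, H, W) layout and a downsampling scale of 2, and it stands in a 1x1 non-strided convolution for whatever kernel size the network actually uses. The function names `space_to_depth`, `pointwise_conv`, and `spd_conv` are hypothetical and chosen for illustration only.

```python
import numpy as np


def space_to_depth(x: np.ndarray, scale: int = 2) -> np.ndarray:
    """Rearrange a (C, H, W) map into (C * scale**2, H // scale, W // scale).

    No pixel is discarded: spatial resolution is traded for channel depth,
    unlike a strided convolution or pooling layer.
    """
    c, h, w = x.shape
    if h % scale or w % scale:
        raise ValueError("H and W must be divisible by scale")
    # Each (i, j) offset selects one interleaved sub-grid of the input.
    slices = [x[:, i::scale, j::scale] for i in range(scale) for j in range(scale)]
    # Stacking the sub-grids along the channel axis keeps every input value.
    return np.concatenate(slices, axis=0)


def pointwise_conv(x: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """Non-strided 1x1 convolution; weight has shape (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, x)


def spd_conv(x: np.ndarray, weight: np.ndarray, scale: int = 2) -> np.ndarray:
    """Space-to-depth followed by a non-strided convolution (illustrative)."""
    return pointwise_conv(space_to_depth(x, scale), weight)


# Illustrative input: 2 channels, 4x4 spatial size.
x = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)
y = space_to_depth(x)                  # shape (8, 2, 2)
rng = np.random.default_rng(0)
w = rng.standard_normal((5, 8))        # hypothetical weights: 8 -> 5 channels
z = spd_conv(x, w)                     # shape (5, 2, 2)
```

Because the rearrangement is a permutation of the input values, the downsampled map holds the same information as the original, which is the property the fine-grained detection argument relies on.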