Abstract: Existing video-based pedestrian re-identification methods cannot effectively extract the spatiotemporal information between consecutive video frames. To address this, this paper proposes a pedestrian re-identification network based on non-local attention and multi-feature fusion that extracts global and local representation features together with temporal information. First, a non-local attention module is embedded to extract global features. Next, the low- and mid-level features of the network are extracted and fused with local features to obtain the salient features of the pedestrian. Finally, the similarity between pedestrian features is measured and ranked to produce the video pedestrian re-identification result. The proposed model significantly improves performance on the large datasets MARS and DukeMTMC-VideoReID, reaching mAP values of 81.4% and 93.4% and Rank-1 values of 88.7% and 95.3%, respectively; on the small dataset PRID2011, Rank-1 reaches 94.8%.
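The final step (measuring similarity, ranking, and scoring with Rank-1 and mAP) can be illustrated with a minimal sketch. It is not the paper's implementation: cosine distance is an assumed similarity measure, the function names are hypothetical, and the camera-based filtering that real benchmarks apply to the gallery is omitted for brevity.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two feature vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def rank_gallery(query_feat, gallery_feats):
    """Return gallery indices sorted from most to least similar to the query."""
    dists = [cosine_distance(query_feat, g) for g in gallery_feats]
    return sorted(range(len(gallery_feats)), key=lambda i: dists[i])


def rank1_hit(ranking, gallery_ids, query_id):
    """True if the top-ranked gallery item shares the query identity."""
    return gallery_ids[ranking[0]] == query_id


def average_precision(ranking, gallery_ids, query_id):
    """Average precision for one query; mAP is the mean of this over all queries."""
    hits = 0
    precision_sum = 0.0
    for position, idx in enumerate(ranking, start=1):
        if gallery_ids[idx] == query_id:
            hits += 1
            precision_sum += hits / position
    return precision_sum / hits if hits else 0.0


# Toy example: one query sequence feature against three gallery features.
query = [1.0, 0.0]
gallery = [[0.9, 0.1], [0.0, 1.0], [0.7, 0.7]]
gallery_ids = ["A", "B", "A"]

ranking = rank_gallery(query, gallery)
print(ranking)                                        # gallery indices, best match first
print(rank1_hit(ranking, gallery_ids, "A"))           # Rank-1 hit for identity "A"
print(average_precision(ranking, gallery_ids, "A"))   # AP for identity "A"
```

Rank-1 is the fraction of queries for which `rank1_hit` is true, and mAP is the mean of `average_precision` over all queries, which is how the percentages quoted above are conventionally defined.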