A six-degrees-of-freedom (6D) object pose estimation algorithm based on a filter learning network was proposed to address the accuracy and real-time requirements of pose estimation for weakly textured objects in complex scenes. First, standard convolutions were replaced with Blueprint Separable Convolutions (BSConv) to reduce model parameters, and the GELU (Gaussian Error Linear Unit) activation function was adopted for its closer approximation to the normal distribution, improving network performance. Second, an Upsampling Filtering And Encoding information Module (UFAEM) was proposed to compensate for the loss of key information during upsampling. Finally, a Global Attention Mechanism (GAM) was introduced to enrich contextual information and extract information from input feature maps more effectively. Experimental results on the public LineMOD, YCB-Video, and Occlusion LineMOD datasets show that the proposed algorithm significantly reduces network parameters while improving accuracy: the parameter count is reduced by nearly three-quarters, and under the ADD(-S) metric, accuracy is improved by about 1.2 percentage points over the Dual-Stream algorithm on the LineMOD dataset, by about 5.2 percentage points over the DenseFusion algorithm on the YCB-Video dataset, and by about 6.6 percentage points over the Pixel-wise Voting Network (PVNet) algorithm on the Occlusion LineMOD dataset. These results demonstrate that the proposed algorithm performs well in estimating the pose of weakly textured objects and exhibits a degree of robustness when estimating the pose of occluded objects.
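The parameter savings from replacing a standard convolution with a blueprint separable convolution, and the shape of the GELU activation, can be illustrated with a small sketch. The layer sizes below are illustrative examples, not values taken from the paper, and the BSConv variant shown (unconstrained pointwise-then-depthwise, often called BSConv-U) is one common formulation.

```python
import math

def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    # Standard convolution: a full K x K kernel for every
    # (input channel, output channel) pair.
    return c_in * c_out * k * k

def bsconv_params(c_in: int, c_out: int, k: int) -> int:
    # BSConv-U sketch: a 1x1 pointwise convolution (c_in -> c_out)
    # followed by a K x K depthwise convolution with one kernel
    # per output channel.
    return c_in * c_out + c_out * k * k

def gelu(x: float) -> float:
    # tanh approximation of GELU, which weights each input by an
    # approximate Gaussian CDF instead of hard-gating like ReLU.
    return 0.5 * x * (1.0 + math.tanh(
        math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))

if __name__ == "__main__":
    c_in, c_out, k = 256, 256, 3  # illustrative layer size
    std = standard_conv_params(c_in, c_out, k)
    bs = bsconv_params(c_in, c_out, k)
    print(f"standard: {std}, bsconv: {bs}, ratio: {bs / std:.3f}")
    print(f"gelu(0) = {gelu(0.0):.3f}, gelu(3) = {gelu(3.0):.3f}")
```

For this 256-channel, 3x3 layer the separable factorization keeps only a small fraction of the standard convolution's weights, which is the kind of reduction the abstract's "nearly three-quarters" overall parameter saving relies on.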