Text-based person retrieval aims to identify a specific person using textual descriptions as queries. Existing state-of-the-art methods typically design multiple alignment mechanisms to establish correspondence between cross-modal data at both global and local levels, but they neglect the mutual influence among these mechanisms. To address this, a multi-granularity shared semantic center association mechanism was proposed to explore the promoting and inhibiting effects between global and local alignments. Firstly, a multi-granularity cross-alignment module was introduced to enhance image-sentence and region-word interactions, achieving multi-level alignment of cross-modal data in a joint embedding space. Then, a shared semantic center was established as a learnable semantic hub, and the associations among global and local features were used to enhance semantic consistency across the different alignment mechanisms and to promote the collaborative effect of global and local features. Within the shared semantic center, the global and local cross-modal similarity relationships between image and text features were calculated, providing a complementary measure from both global and local perspectives and maximizing the positive effects among the multiple alignment mechanisms. Finally, experiments were carried out on the CUHK-PEDES dataset. The results show that the proposed method significantly improves Rank-1 accuracy by 8.69 percentage points and mean Average Precision (mAP) by 6.85 percentage points compared with the baseline method. The proposed method also achieves excellent performance on the ICFG-PEDES and RSTPReid datasets, significantly surpassing all compared methods.
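To make the idea of the shared semantic center more concrete, the following is a minimal PyTorch sketch, not the authors' implementation, of how a set of learnable center vectors could relate global image-sentence similarity with region-word similarity measured through the centers. The class name `SharedSemanticCenter`, the attention-style pooling onto the centers, the number of centers, and the learnable fusion gate `alpha` are all illustrative assumptions; the abstract does not specify these details.

```python
# Hypothetical sketch of a learnable shared semantic center that fuses
# global and local cross-modal similarities (shapes and pooling assumed).
import torch
import torch.nn as nn
import torch.nn.functional as F


class SharedSemanticCenter(nn.Module):
    """Learnable semantic hub relating global and local image/text features."""

    def __init__(self, dim: int = 512, num_centers: int = 32):
        super().__init__()
        # Center vectors shared by both modalities (the "semantic hub").
        self.centers = nn.Parameter(torch.randn(num_centers, dim) * 0.02)
        # Learnable gate balancing global and local similarity terms.
        self.alpha = nn.Parameter(torch.tensor(0.5))

    def project(self, local_feats: torch.Tensor) -> torch.Tensor:
        """Pool local features (B, N, D) into center-aligned features (B, K, D)."""
        attn = torch.einsum("bnd,kd->bnk",
                            F.normalize(local_feats, dim=-1),
                            F.normalize(self.centers, dim=-1))
        attn = attn.softmax(dim=1)  # attention weights over the N local parts
        return torch.einsum("bnk,bnd->bkd", attn, local_feats)

    def forward(self, img_global, img_regions, txt_global, txt_words):
        # Global image-sentence cosine similarity: (B_img, B_txt).
        sim_global = F.normalize(img_global, dim=-1) @ F.normalize(txt_global, dim=-1).t()

        # Local region-word similarity measured through the shared centers.
        img_c = F.normalize(self.project(img_regions), dim=-1)  # (B_img, K, D)
        txt_c = F.normalize(self.project(txt_words), dim=-1)    # (B_txt, K, D)
        sim_local = torch.einsum("ikd,jkd->ijk", img_c, txt_c).mean(dim=-1)

        # Complementary global + local measure used for ranking.
        a = torch.sigmoid(self.alpha)
        return a * sim_global + (1 - a) * sim_local


if __name__ == "__main__":
    # Random tensors stand in for backbone image/text features.
    ssc = SharedSemanticCenter(dim=512, num_centers=32)
    sim = ssc(torch.randn(4, 512), torch.randn(4, 6, 512),
              torch.randn(4, 512), torch.randn(4, 24, 512))
    print(sim.shape)  # torch.Size([4, 4]) image-to-text similarity matrix
```

In this sketch, both modalities are pooled onto the same learnable centers, so the local similarity is computed in a shared space, which is one plausible way to realize the "complementary measure from both global and local perspectives" described above.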