Speech-to-Speech Translation (S2ST) is an emerging research direction in the field of intelligent speech, aiming to translate speech in one language directly into speech in another. With the growing demand for cross-lingual communication, S2ST has attracted significant attention and driven continuous research. Traditional cascaded models face numerous challenges in S2ST, including error propagation, high inference latency, and the inability to translate languages without a writing system. To address these issues, achieving direct S2ST with end-to-end models has become a key research focus. Based on a comprehensive survey of end-to-end S2ST models, a detailed analysis and summary of the various end-to-end S2ST models is provided, the existing related technologies are reviewed, and the challenges are grouped into three categories: modeling burden, data scarcity, and real-world application, with a focus on how existing work has addressed each. The extensive comprehension and generation capabilities of Large Language Models (LLMs) offer new possibilities for S2ST, while simultaneously presenting additional challenges. Effective ways of applying LLMs to S2ST are discussed, and potential future development directions are outlined.