%0 Journal Article
%A QIU Zeyu
%A QU Dan
%A ZHANG Lianhai
%T End-to-end speech synthesis based on WaveNet
%D 2019
%R 10.11772/j.issn.1001-9081.2018102131
%J Journal of Computer Applications
%P 1325-1329
%V 39
%N 5
%X The Griffin-Lim algorithm is widely used for phase estimation in end-to-end speech synthesis, but the resulting speech is noticeably artificial and of low fidelity. To address this problem, an end-to-end speech synthesis system based on the WaveNet network architecture is proposed. Built on a Seq2Seq (Sequence-to-Sequence) structure, the system first converts the input text into one-hot vectors, then applies an attention mechanism to obtain a Mel spectrogram, and finally uses a WaveNet network to reconstruct phase information and generate time-domain waveform samples from the Mel spectrogram features. For English and Chinese, the proposed method achieves a Mean Opinion Score (MOS) of 3.31 on the LJSpeech-1.0 corpus and 3.02 on the THCHS-30 corpus, outperforming both Griffin-Lim-based end-to-end systems and parametric systems in terms of naturalness.
%U http://www.joca.cn/EN/10.11772/j.issn.1001-9081.2018102131