Journal of Computer Applications ›› 2018, Vol. 38 ›› Issue (4): 971-977. DOI: 10.11772/j.issn.1001-9081.2017092149

• Artificial intelligence •

Russian phonetic transcription system based on TensorFlow

FENG Wei, YI Mianzhu, MA Yanzhou

  1. The PLA Strategic Support Force Information Engineering University (Luoyang), Luoyang, Henan 471003, China
  • Received: 2017-09-04 Revised: 2017-11-18 Online: 2018-04-10 Published: 2018-04-09
  • Corresponding author: YI Mianzhu
  • About the authors: FENG Wei (born 1993), male, from Xi'an, Shaanxi, M.S. candidate; main research interest: natural language processing. YI Mianzhu (born 1964), male, from Yingshan, Sichuan, Ph.D., professor; main research interests: computational linguistics and language information processing. MA Yanzhou (born 1977), male, from Luoyang, Henan, Ph.D., associate professor; main research interests: computational linguistics and language information processing.
  • Supported by:
    This work is partially supported by the Project of Social Science Planning of Luoyang (2016B285).

Abstract: To address the limited size of pronunciation dictionaries in Russian speech synthesis and speech recognition systems, a Russian grapheme-to-phoneme algorithm based on a Long Short-Term Memory (LSTM) sequence-to-sequence model was proposed, and a prototype phonetic transcription system was designed and implemented. Firstly, an improved Russian phoneme set based on the Speech Assessment Methods Phonetic Alphabet (SAMPA) was designed so that transcription results reflect the stress position and vowel reduction of Russian words, and a 20 000-word Russian pronunciation dictionary was constructed according to the new phoneme set. Then, the algorithm was implemented with the TensorFlow framework: an encoder LSTM converts a Russian word into a fixed-dimensional vector, and a decoder LSTM converts that vector into the target pronunciation sequence. Finally, a Russian phonetic transcription system with functions such as interactive word transcription was designed and implemented. Experimental results show that, on the out-of-vocabulary test set, the algorithm reaches a word accuracy of 74.8% and a phoneme accuracy of 94.5%, both higher than those of the Phonetisaurus method. The system can effectively support the construction of Russian pronunciation dictionaries.

Key words: Russian, phonetic transcription, Long Short-Term Memory (LSTM), sequence-to-sequence, TensorFlow
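
The abstract reports results as a word-level and a phoneme-level rate but does not define either metric. The following is a minimal sketch of one common way to compute such scores, under stated assumptions: the word-level rate is taken as the fraction of words whose predicted phoneme sequence matches the reference exactly, and the phoneme-level rate as one minus the total Levenshtein distance divided by the total number of reference phonemes. The function names (`edit_distance`, `score`) and the toy phoneme symbols are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def score(pairs: List[Tuple[List[str], List[str]]]) -> Tuple[float, float]:
    """Return (word_accuracy, phoneme_accuracy) for (reference, hypothesis) pairs.

    Assumed definitions (not confirmed by the source):
      word accuracy    = exact-match words / all words
      phoneme accuracy = 1 - total edit distance / total reference phonemes
    """
    if not pairs:
        raise ValueError("pairs must not be empty")
    exact = sum(1 for ref, hyp in pairs if ref == hyp)
    errors = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    total = sum(len(ref) for ref, _ in pairs)
    return exact / len(pairs), 1.0 - errors / total


# Toy data with made-up phoneme symbols (not the paper's phoneme set).
toy_pairs = [
    (["m", "a", "m", "a"], ["m", "a", "m", "a"]),  # exact match
    (["d", "o", "m"], ["d", "a", "m"]),            # one substitution
]
word_acc, phoneme_acc = score(toy_pairs)
```

Because a single wrong phoneme makes the whole word count as incorrect, the word-level rate under these definitions is always the stricter of the two, which is consistent with the gap between the two figures reported in the abstract.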

CLC number: