Journal of Computer Applications ›› 2016, Vol. 36 ›› Issue (12): 3369-3373. DOI: 10.11772/j.issn.1001-9081.2016.12.3369

Estimation algorithm of switching speech power spectrum for automatic speech recognition system

LIU Jingang, ZHOU Yi, MA Yongbao, LIU Hongqing   

  1. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  • Received: 2016-05-25 Revised: 2016-07-12 Online: 2016-12-08 Published: 2016-12-10
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61501072) and the Natural Science Foundation of Chongqing Science and Technology Commission (cstc2015jcyjA40027).


刘金刚, 周翊, 马永保, 刘宏清   

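  1. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065
  • Corresponding author: LIU Jingang
  • About the authors: LIU Jingang (born 1991), male, from Zhucheng, Shandong, master's student; research interests: speech signal processing and speech enhancement. ZHOU Yi (born 1974), male, from Chengdu, Sichuan, professor, Ph.D.; research interests: adaptive filtering and speech signal processing. MA Yongbao (born 1991), male, from Wuwei, Gansu, master's student; research interests: speech signal processing and speech enhancement. LIU Hongqing (born 1980), male, from Jiamusi, Heilongjiang, professor, Ph.D.; research interests: sparse signal processing and array signal processing.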

Abstract: To address the poor robustness of Automatic Speech Recognition (ASR) systems in noisy environments, a switching estimation algorithm for the speech power spectrum is proposed. First, on the assumption that the speech spectral amplitude is better modelled by a Chi distribution, a modified speech power spectrum estimator based on Minimum Mean Square Error (MMSE) is derived. Incorporating the Speech Presence Probability (SPP) then yields a new SPP-based MMSE estimator. Next, this estimator is combined with the conventional Wiener filter to form a switching algorithm: in heavy noise, the modified MMSE estimator is used to estimate the clean speech power spectrum; otherwise, the Wiener filter is used to reduce the amount of computation. The result is a switching speech power spectrum estimation algorithm for ASR systems. Experimental results show that, compared with the traditional MMSE estimator with a Rayleigh prior, the proposed algorithm improves recognition accuracy by 8 percentage points on average across various noise environments. The proposed algorithm improves the robustness of the ASR system by removing noise while also reducing the computational cost.
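
The switching idea described in the abstract can be illustrated per frequency bin with a minimal Python sketch. Several parts are assumptions rather than the paper's method: the abstract does not give the Chi-prior estimator, so the well-known MMSE power estimate under a complex-Gaussian (Rayleigh-amplitude) prior stands in for it; the SPP uses the standard Gaussian-model likelihood ratio; and the switching rule (a threshold on the a priori SNR, here the hypothetical parameter `snr_switch_db`) is invented for illustration, since the abstract only says "heavy noise".

```python
import math


def wiener_power(y_pow, xi):
    """Wiener-filter estimate of clean speech power: (xi / (1 + xi))^2 * |Y|^2."""
    gain = xi / (1.0 + xi)
    return gain * gain * y_pow


def mmse_power_rayleigh(y_pow, noise_pow, xi):
    """MMSE estimate of |S|^2 under a complex-Gaussian (Rayleigh-amplitude) prior.

    Stand-in for the paper's Chi-prior estimator, which the abstract does not spell out.
    The posterior of S is Gaussian with mean gain * Y and variance gain * noise_pow.
    """
    gain = xi / (1.0 + xi)
    return gain * gain * y_pow + gain * noise_pow


def speech_presence_prob(y_pow, noise_pow, xi, prior_absence=0.5):
    """Posterior speech presence probability under the Gaussian model.

    prior_absence is an assumed a priori probability of speech absence.
    """
    gamma = y_pow / noise_pow           # a posteriori SNR
    nu = gamma * xi / (1.0 + xi)
    ratio = prior_absence / (1.0 - prior_absence)
    return 1.0 / (1.0 + ratio * (1.0 + xi) * math.exp(-nu))


def switching_power_estimate(y_pow, noise_pow, xi, snr_switch_db=10.0):
    """Pick the SPP-weighted MMSE estimate in heavy noise, else the cheaper Wiener estimate.

    snr_switch_db is a hypothetical threshold on the a priori SNR xi (linear scale).
    """
    xi_db = 10.0 * math.log10(max(xi, 1e-10))
    if xi_db < snr_switch_db:
        p = speech_presence_prob(y_pow, noise_pow, xi)
        return p * mmse_power_rayleigh(y_pow, noise_pow, xi)
    return wiener_power(y_pow, xi)
```

With `xi = 1.0` (0 dB) the sketch takes the SPP-weighted MMSE branch, and with `xi = 100.0` (20 dB) it falls back to the Wiener estimate, which needs no exponential and so is cheaper per bin.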

Key words: Automatic Speech Recognition (ASR) system, robustness, Minimum Mean Square Error (MMSE), Speech Presence Probability (SPP), estimation of speech power spectrum, Wiener filter

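Abstract (Chinese version): To address the problem that speech recognition systems do not remain robust in noisy environments, a switching speech power spectrum estimation algorithm is proposed. Assuming that the speech amplitude spectrum follows a Chi distribution, the algorithm introduces an improved speech power spectrum estimator based on minimum mean square error (MMSE). The speech presence probability (SPP) is then incorporated to derive an improved SPP-based MMSE estimator. Next, the improved MMSE estimator is combined with the conventional Wiener filter: when noise interference is strong, the improved MMSE estimator is used to estimate the clean speech power spectrum; when it is weak, the conventional Wiener filter is used instead to reduce computation. This yields the final switching speech power spectrum estimation algorithm for recognition systems. Experimental results show that, compared with the conventional MMSE estimator under a Rayleigh distribution, the proposed algorithm raises the recognition rate by about 8 percentage points on average under various noise conditions. It removes noise interference and improves the robustness of the recognition system while reducing the power consumption of the speech recognition system.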

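Key words (Chinese version): automatic speech recognition system, robustness, minimum mean square error, speech presence probability, power spectrum estimation, Wiener filter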

