Journal of Computer Applications ›› 2020, Vol. 40 ›› Issue (5): 1476-1482. DOI: 10.11772/j.issn.1001-9081.2019081514

• Virtual reality and multimedia computing •

Speech enhancement using multi-microphone state space model under industrial noise environment

WU Qinghe, WU Haifeng, SHEN Yong, ZENG Yu

  1. School of Electric and Informative Engineering, Yunnan Minzu University, Kunming, Yunnan 650504, China
  • Received: 2019-09-05 Revised: 2019-12-11 Online: 2020-05-10 Published: 2020-05-15
  • Contact: WU Haifeng, born in 1977, Ph. D., professor. His research interests include machine learning, mobile communication, and neural signal processing.
  • About the authors: WU Qinghe, born in 1990, from Zhengyang, Henan, M. S. candidate. His research interests include speech enhancement, blind source signal separation, and machine learning. WU Haifeng, born in 1977, from Kunming, Yunnan, Ph. D., professor. His research interests include machine learning, mobile communication, and neural signal processing. SHEN Yong, born in 1965, from Kunming, Yunnan, M. S., professor. His research interests include communication protocols and embedded systems. ZENG Yu, born in 1981, from Kunming, Yunnan, M. S., lecturer. Her research interests include wireless network control and mobile communication.
  • Supported by:

    This work is partially supported by the National Natural Science Foundation of China (61762093), the Yunnan Applied Basic Research Key Project (2018FA036), and the Science and Technology Innovation Team of Yunnan Colleges and Universities.

Abstract:

In collaborative industrial environments, speech is often submerged in industrial noise, which degrades the effectiveness of speech communication. For such noisy conditions, a Kalman speech enhancement algorithm using multiple microphones was proposed. The algorithm simplifies the difference equation in the State Space Model (SSM) to reduce complexity, and it produces a denoised signal at every sampling point, which improves real-time performance. In addition, to further reduce complexity, the least squares criterion was used to enhance the speech. In the experiments, speech signals and factory noise signals from a public database were used to simulate noisy speech in a multi-microphone environment, and the proposed algorithm was compared with the traditional algorithm. The experimental results show that the output speech-to-noise ratio (the ratio of enhanced speech to residual noise) of the proposed algorithm is about 2 dB higher than that of the traditional algorithm, its running time is less than 2% of that of the traditional algorithm, and its delay is only on the order of milliseconds.

Key words: industrial noise, State Space Model (SSM), multi-microphone, speech enhancement
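
The abstract does not give the paper's equations, so the following is only a minimal illustrative sketch of the two ideas it names: combining several microphones under a least squares criterion, one sample at a time, and measuring the output speech-to-noise ratio. It is not the authors' algorithm. The signal model (each microphone observes the same speech scaled by a known gain plus independent noise), the gain values, the synthetic sine "speech", and the function names `least_squares_enhance`, `snr_db`, and `simulate` are all assumptions made for illustration.

```python
import math
import random


def least_squares_enhance(mics, gains):
    """Per-sample least-squares estimate of the speech from several microphones.

    Assumed model (not taken from the paper): mics[m][n] = gains[m] * s[n] + noise.
    Minimising the squared error over s[n] gives
    s_hat[n] = sum(g_m * y_m[n]) / sum(g_m ** 2).
    """
    norm = sum(g * g for g in gains)
    n_samples = len(mics[0])
    return [
        sum(g * y[n] for g, y in zip(gains, mics)) / norm
        for n in range(n_samples)
    ]


def snr_db(clean, estimate):
    """Ratio of clean-speech energy to residual-noise energy, in dB."""
    signal = sum(s * s for s in clean)
    residual = sum((e - s) ** 2 for s, e in zip(clean, estimate))
    return 10.0 * math.log10(signal / residual)


def simulate(n_samples=2000, gains=(1.0, 0.8, 0.6, 0.9), noise_std=1.0, seed=0):
    """Build a toy multi-microphone recording: a sine tone plus Gaussian noise."""
    rng = random.Random(seed)
    clean = [math.sin(2 * math.pi * 5 * n / n_samples) for n in range(n_samples)]
    mics = [[g * s + rng.gauss(0.0, noise_std) for s in clean] for g in gains]
    return clean, mics, list(gains)


if __name__ == "__main__":
    clean, mics, gains = simulate()
    enhanced = least_squares_enhance(mics, gains)
    print("single microphone SNR (dB):", snr_db(clean, mics[0]))
    print("least-squares output SNR (dB):", snr_db(clean, enhanced))
```

Because each output sample depends only on the current microphone samples, a combiner of this kind adds no buffering delay, which is the sort of property the abstract refers to when it mentions per-sample output. A Kalman filter would additionally exploit a model of how the speech evolves over time, which this sketch leaves out.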

CLC Number: