Speech enhancement using multi-microphone state space model under industrial noise environment

Journal of Computer Applications, 2020, Vol. 40, Issue (5): 1476-1482. DOI: 10.11772/j.issn.1001-9081.2019081514

• Virtual Reality and Multimedia Computing •

WU Qinghe, WU Haifeng, SHEN Yong, ZENG Yu

School of Electric and Informative Engineering, Yunnan Minzu University, Kunming, Yunnan 650504, China

Received: 2019-09-05 Revised: 2019-12-11 Online: 2020-05-10 Published: 2020-05-15
Contact: WU Haifeng
About the authors: WU Qinghe, born in 1990 in Zhengyang, Henan, is an M.S. candidate. His research interests include speech enhancement, blind source separation, and machine learning. WU Haifeng, born in 1977 in Kunming, Yunnan, Ph.D., is a professor. His research interests include machine learning, mobile communication, and neural signal processing. SHEN Yong, born in 1965 in Kunming, Yunnan, M.S., is a professor. His research interests include communication protocols and embedded systems. ZENG Yu, born in 1981 in Kunming, Yunnan, M.S., is a lecturer. Her research interests include wireless network control and mobile communication.
Supported by:
This work is partially supported by the National Natural Science Foundation of China (61762093), the Yunnan Applied Basic Research Key Project (2018FA036), and the Science and Technology Innovation Team of Yunnan Colleges and Universities.

Abstract:

In collaborative industrial settings, speech is often drowned out by industrial noise, which degrades the effectiveness of voice communication. To address this, a Kalman speech enhancement algorithm using multiple microphones was proposed. The algorithm simplifies the difference equation in the State Space Model (SSM) to reduce complexity, and it produces a denoised signal at every sampling point, which improves real-time performance. To reduce complexity further, the least-squares criterion was also used to enhance the speech. In the experiments, speech signals and factory noise signals from a public database were used to simulate noisy speech in a multi-microphone environment, and the proposed algorithm was compared with the traditional algorithm. The results show that the output speech-to-noise ratio of the proposed algorithm (the ratio of the enhanced speech to the residual noise) is about 2 dB higher than that of the traditional algorithm, its running time is less than 2% of that of the traditional algorithm, and its delay is only on the order of milliseconds.

Key words: industrial noise, State Space Model (SSM), multi-microphone, speech enhancement
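The per-sample filtering idea described in the abstract can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the paper: a first-order difference equation for the speech state, known microphone gains `h`, white Gaussian noise in place of factory noise, and a synthetic AR(1) signal in place of real speech. The function names `kalman_multimic`, `least_squares_multimic`, `output_snr_db`, and `demo` are hypothetical, and the output ratio is computed here as clean-signal power over residual-error power, which may differ from the paper's exact definition.

```python
import numpy as np


def kalman_multimic(y, h, a, q, r):
    """Per-sample Kalman filter: scalar speech state, M-microphone observation.

    Assumed model (illustrative, not taken from the paper):
        s[n] = a * s[n-1] + w[n],   w ~ N(0, q)
        y[n] = h * s[n] + v[n],     v ~ N(0, r * I)
    y has shape (N, M); h has shape (M,).
    """
    s_hat = np.zeros(y.shape[0])
    s_prev, p_prev = 0.0, 1.0
    for n in range(y.shape[0]):
        # Predict from the first-order difference equation.
        s_pred = a * s_prev
        p_pred = a * a * p_prev + q
        # Update in information form (valid for a scalar state with R = r * I).
        p_prev = 1.0 / (1.0 / p_pred + (h @ h) / r)
        gain = p_prev * h / r
        s_prev = s_pred + gain @ (y[n] - h * s_pred)
        s_hat[n] = s_prev  # one denoised output per input sample
    return s_hat


def least_squares_multimic(y, h):
    """Memoryless least-squares estimate of s[n] from the M readings y[n]."""
    return (y @ h) / (h @ h)


def output_snr_db(clean, enhanced):
    """Clean-signal power over residual-error power, in dB (assumed definition)."""
    residual = enhanced - clean
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(residual ** 2))


def demo(seed=0, n_samples=20000, n_mics=4, a=0.98, q=0.01, r=1.0):
    """Return (single-mic, least-squares, Kalman) ratios in dB on synthetic data."""
    rng = np.random.default_rng(seed)
    # Synthetic AR(1) signal standing in for speech (not real speech).
    s = np.zeros(n_samples)
    w = rng.normal(0.0, np.sqrt(q), n_samples)
    for n in range(1, n_samples):
        s[n] = a * s[n - 1] + w[n]
    h = np.ones(n_mics)  # assumed equal, known microphone gains
    # White Gaussian noise standing in for factory noise.
    y = np.outer(s, h) + rng.normal(0.0, np.sqrt(r), (n_samples, n_mics))
    return (
        output_snr_db(s, y[:, 0]),
        output_snr_db(s, least_squares_multimic(y, h)),
        output_snr_db(s, kalman_multimic(y, h, a, q, r)),
    )


if __name__ == "__main__":
    single, ls, kf = demo()
    print(f"single mic: {single:.1f} dB, least squares: {ls:.1f} dB, Kalman: {kf:.1f} dB")
```

In this toy setting the least-squares estimate needs no recursion at all, which is one plausible reading of why a least-squares step lowers complexity, while the Kalman recursion also exploits correlation between successive samples. Both produce one output per input sample, so no frame buffering is involved.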

CLC Number: TN912.35

WU Qinghe, WU Haifeng, SHEN Yong, ZENG Yu. Speech enhancement using multi-microphone state space model under industrial noise environment[J]. Journal of Computer Applications, 2020, 40(5): 1476-1482.
