When speech communication takes place in a collaborative industrial environment, the speech is often submerged in industrial noise, which reduces the effectiveness of communication. For speech environments with industrial noise, a Kalman speech enhancement algorithm using multiple microphones was proposed. In the algorithm, the difference equation in the State Space Model (SSM) was simplified to reduce complexity, and the denoised signal was obtained at each sampling point to improve real-time performance. In addition, to further reduce complexity, the least squares method was used to enhance the speech. In the experiments, speech signals and factory noise signals from a public database were used to simulate noisy speech in a multi-microphone environment, and the proposed algorithm was compared with the traditional algorithm. The experimental results show that the output speech-to-noise ratio (the ratio of enhanced speech to residual noise) of the proposed algorithm is about 2 dB higher than that of the traditional algorithm, and that its running time is less than 2% of that of the traditional algorithm. At the same time, the delay of the algorithm is only several milliseconds.
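The idea of a per-sample least squares estimate from several microphones, and of the output speech-to-noise ratio used as the evaluation metric, can be illustrated with a minimal sketch. This is not the proposed algorithm: it assumes a simplified observation model `y[m, n] = a[m] * s[n] + v[m, n]` with known per-microphone gains `a`, no inter-microphone delay, and white Gaussian noise in place of factory noise. The names `least_squares_estimate` and `snr_db`, the gain values, and the sine-wave stand-in for speech are all hypothetical.

```python
import numpy as np

def least_squares_estimate(y, a):
    """Per-sample least squares estimate of s[n] from y[m, n] = a[m] * s[n] + v[m, n].

    Minimising sum_m (y[m, n] - a[m] * s[n])**2 over s[n] gives
    s_hat[n] = (a^T y[:, n]) / (a^T a), computed independently for each sample.
    """
    a = np.asarray(a, dtype=float)
    return a @ y / (a @ a)

def snr_db(signal, noise):
    """Ratio of signal energy to noise energy, in dB."""
    return 10.0 * np.log10(np.sum(signal ** 2) / np.sum(noise ** 2))

rng = np.random.default_rng(0)
fs = 8000                                   # assumed sampling rate in Hz
n = np.arange(fs)                           # one second of samples
s = np.sin(2 * np.pi * 200 * n / fs)        # stand-in for clean speech (assumption)
a = np.array([1.0, 0.9, 0.8, 0.7])          # hypothetical per-microphone gains
v = rng.normal(scale=1.0, size=(a.size, n.size))  # white noise, not factory noise
y = a[:, None] * s[None, :] + v             # simulated multi-microphone observations

s_hat = least_squares_estimate(y, a)

# Input SNR at the first microphone, and output SNR defined as the ratio of
# speech to residual noise in the enhanced signal.
input_snr = snr_db(a[0] * s, v[0])
output_snr = snr_db(s, s_hat - s)
print(f"input SNR: {input_snr:.2f} dB, output SNR: {output_snr:.2f} dB")
```

Because each output sample depends only on the observations at the same sampling instant, an estimator of this kind adds no buffering delay, which is consistent in spirit with the per-sample processing described above. Under these assumptions the expected gain over the first microphone is roughly `10 * log10(a @ a)` dB.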