| 1 | BROWN G J, COOKE M. Computational auditory scene analysis[J]. Computer Speech and Language, 1994, 8(4): 297-336. DOI: 10.1006/csla.1994.1016 |
| 2 | WU Z Y, ZHANG Z Y, LI X, et al. The research advance of auditory scene analysis[J]. Journal of Circuits and Systems, 2001, 6(2): 68-73. DOI: 10.3969/j.issn.1007-0249.2001.02.015 |
| 3 | CHANG J H, JO Q H, KIM D K, et al. Global soft decision employing support vector machine for speech enhancement[J]. IEEE Signal Processing Letters, 2009, 16(1): 57-60. DOI: 10.1109/lsp.2008.2008574 |
| 4 | WILSON K W, RAJ B, SMARAGDIS P, et al. Speech denoising using nonnegative matrix factorization with priors[C]// Proceedings of the 2008 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2008: 4029-4032. DOI: 10.1109/icassp.2008.4518538 |
| 5 | WILSON K W, RAJ B, SMARAGDIS P. Regularized non-negative matrix factorization with temporal dependencies for speech denoising[C]// Proceedings of the INTERSPEECH 2008. [S.l.]: International Speech Communication Association, 2008: 411-414. DOI: 10.21437/interspeech.2008-49 |
| 6 | SCHMIDT M N, LARSEN J, HSIAO F T. Wind noise reduction using non-negative sparse coding[C]// Proceedings of the 2007 IEEE Workshop on Machine Learning for Signal Processing. Piscataway: IEEE, 2007: 431-436. DOI: 10.1109/mlsp.2007.4414345 |
| 7 | WANG Y X, WANG D L. Towards scaling up classification-based speech separation[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2013, 21(7): 1381-1390. DOI: 10.1109/tasl.2013.2250961 |
| 8 | YUAN W H, SUN W Z, XIA B, et al. Improving speech enhancement in unseen noise using deep convolutional neural network[J]. Acta Automatica Sinica, 2018, 44(4): 751-759. DOI: 10.16383/j.aas.2018.c170001 |
| 9 | XU Y, DU J, DAI L R, et al. A regression approach to speech enhancement based on deep neural networks[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2015, 23(1): 7-19. DOI: 10.1109/taslp.2014.2364452 |
| 10 | VINCENT E, GRIBONVAL R, FÉVOTTE C. Performance measurement in blind audio source separation[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2006, 14(4): 1462-1469. DOI: 10.1109/tsa.2005.858005 |
| 11 | WANG Y X, NARAYANAN A, WANG D L. On training targets for supervised speech separation[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2014, 22(12): 1849-1858. DOI: 10.1109/taslp.2014.2352935 |
| 12 | PEARLMUTTER B A. Gradient calculations for dynamic recurrent neural networks: a survey[J]. IEEE Transactions on Neural Networks, 1995, 6(5): 1212-1228. DOI: 10.1109/72.410363 |
| 13 | WENINGER F, EYBEN F, SCHULLER B. Single-channel speech separation with memory-enhanced recurrent neural networks[C]// Proceedings of the 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing. Piscataway: IEEE, 2014: 3709-3713. DOI: 10.1109/icassp.2014.6854294 |
| 14 | WENINGER F, HERSHEY J R, LE ROUX J, et al. Discriminatively trained recurrent neural networks for single-channel speech separation[C]// Proceedings of the 2014 IEEE Global Conference on Signal and Information Processing. Piscataway: IEEE, 2014: 577-581. DOI: 10.1109/globalsip.2014.7032183 |
| 15 | TU Y H, DU J, LEE C H. 2D-to-2D mask estimation for speech enhancement based on fully convolutional neural network[C]// Proceedings of the 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing. Piscataway: IEEE, 2020: 6664-6668. DOI: 10.1109/icassp40776.2020.9054615 |
| 16 | COHEN I. Noise spectrum estimation in adverse environments: improved minima controlled recursive averaging[J]. IEEE Transactions on Speech and Audio Processing, 2003, 11(5): 466-475. DOI: 10.1109/tsa.2003.811544 |
| 17 | TU Y H. Research on robust speech recognition based on deep learning in adverse environment[D]. Hefei: University of Science and Technology of China, 2019: 111. |
| 18 | TANG H, HSU W N, GRONDIN F, et al. A study of enhancement, augmentation, and autoencoder methods for domain adaptation in distant speech recognition[C]// Proceedings of the INTERSPEECH 2018. [S.l.]: International Speech Communication Association, 2018: 2928-2932. DOI: 10.21437/interspeech.2018-2030 |
| 19 | GAO T, DU J, DAI L R, et al. Densely connected progressive learning for LSTM-based speech enhancement[C]// Proceedings of the 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing. Piscataway: IEEE, 2018: 5054-5058. DOI: 10.1109/icassp.2018.8461861 |
| 20 | TU Y H, DU J, GAO T, et al. A multi-target SNR-progressive learning approach to regression based speech enhancement[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2020, 28: 1608-1619. DOI: 10.1109/taslp.2020.2996503 |
| 21 | KINGMA D P, BA J L. Adam: a method for stochastic optimization[EB/OL]. (2017-01-30) [2022-01-03]. |
| 22 | SUN L, DU J, DAI L R, et al. Multiple-target deep learning for LSTM-RNN based speech enhancement[C]// Proceedings of the 2017 Hands-free Speech Communications and Microphone Arrays. Piscataway: IEEE, 2017: 136-140. DOI: 10.1109/hscma.2017.7895577 |
| 23 | ZHOU N, DU J, TU Y H, et al. A speech enhancement neural network architecture with SNR-progressive multi-target learning for robust speech recognition[C]// Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference. Piscataway: IEEE, 2019: 873-877. DOI: 10.1109/apsipaasc47483.2019.9023157 |
| 24 | VINCENT E, WATANABE S, NUGRAHA A A, et al. An analysis of environment, microphone and data simulation mismatches in robust speech recognition[J]. Computer Speech and Language, 2017, 46: 535-557. DOI: 10.1016/j.csl.2016.11.005 |