[1] VIRTANEN T, SINGH R, RAJ B. Techniques for Noise Robustness in Automatic Speech Recognition[M]. New York:Wiley & Sons, 2012:228-231. [2] EPHRAIM Y, MALAH D. Speech enhancement using a minimum mean-square error log-spectral amplitude estimator[J]. IEEE Transactions on Acoustics Speech and Signal Processing, 1985, 33(2):443-445. [3] COHEN I. Optimal speech enhancement under signal presence uncertainty using log-spectral amplitude estimator[J]. IEEE Signal Processing Letters, 2002, 9(4):113-116. [4] ASTUDILLO R F, ORGLMEISTER R. Computing MMSE estimates and residual uncertainty directly in the feature domain of ASR using STFT domain speech distortion models[J]. IEEE Transactions on Acoustics Speech and Signal Processing, 2013, 21(5):1023-1034. [5] JENSEN J, TAN Z H. Minimum mean-square error estimation of Mel-frequency cepstral features theoretically consistent approach[J]. IEEE/ACM Transactions on Audio, Speech and Language Processing, 2015, 23(1):186-197. [6] INDREBO K M, POVINELLI R J, JOHNSON M T. Minimum mean-squared error estimation of Mel-frequency cepstral coefficients using a novel distortion model[J]. IEEE/ACM Transactions on Audio, Speech and Language Processing, 2008, 16(8):1654-1661. [7] LOIZOU P C. Speech Enhancement:Theory and Practice[M]. Boca Raton, FL:CRC Press, 2007:119-122. [8] DAT T H, TAKEDA K, ITAKURA F. Generalized Gamma modeling of speech and its online estimation for speech enhancement[C]//Proceedings of the 2005 IEEE International Conference on Acoustics Speech and Signal Processing. Piscataway, NJ:IEEE, 2005, 4:181-184. [9] LOTTER T, VARY P. Noise reduction by joint maximum a posteriori spectral amplitude and phase estimation with super-Gaussian speech modelling[C]//Proceedings of the 2004 European Conference on Signal Processing. Piscataway, NJ:IEEE, 2004:1457-1460. [10] ERKELENS J S, HENDRIKS R C, HEUSDENS R, et al. Minimum mean-square error estimation of discrete Fourier coefficients with generalized Gamma priors[J]. IEEE Transactions on Audio, Speech and Language Processing, 2007, 15(6):1741-1752. [11] GRADSHTEYN I S, RYZHIK I M. Table of Integrals, Series, and Products[M]. 7th ed. Cambridge, Massachusetts:Academic Press, 2007:346-353, 699-711. [12] STARK A, PALIWAL K. MMSE estimation of log-filterbank energies for robust speech recognition[J]. Speech Communication, 2011, 53(3):403-416. [13] FODOR B, FINGSCHEIDT T. MMSE speech enhancement under speech presence uncertainty assuming (generalized) Gamma speech priors throughout[C]//Proceedings of the 2012 IEEE International Conference on Acoustics Speech and Signal Processing. Piscataway, NJ:IEEE, 2012:4033-4036. [14] TRIBOLET J M, NOLL P, MCDERMOTT B, et al. A study of complexity and quality of speech waveform coders[C]//Proceedings of the 1978 IEEE International Conference on Acoustics, Speech, and Signal Processing. Piscataway, NJ:IEEE, 1978, 3:586-590. [15] RIX A W, BEERENDS J G, HOLLIER M P, et al. Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs[C]//Proceedings of the 2001 IEEE International Conference on Acoustics Speech and Signal Processing. Washing, DC:IEEE Computer Society, 2001, 2:749-752. [16] Carnegie Mellon University. Carnegie Mellon University sphinx[EB/OL].[2016-04-14]. http://cmusphinx.sourceforge.net/. [17] VARGA A, STEENEKEN H J M. Assessment for automatic speech recognition:Ⅱ. NOISEX-92:a database and an experiment to study the effect of additive noise on speech recognition systems[J]. Speech Communication, 1993, 12(93):247-251. [18] BREITHAUPT C, GERKMANN T, MARTIN R. A novel a priori SNR estimation approach based on selective cepstro-temporal smoothing[C]//Proceedings of the 2008 IEEE International Conference on Acoustics Speech and Signal Processing. Piscataway, NJ:IEEE, 2008:4897-4900. [19] HENDRIKS R C, HEUSDENS R, JENSEN J. MMSE based noise PSD tracking with low complexity[C]//Proceedings of the 2010 IEEE International Conference on Acoustics Speech and Signal Processing. Piscataway, NJ:IEEE, 2010:4266-4269. |