[1] 曹亮, 张天骐, 高洪兴, 等. 基于听觉掩蔽效应的多频带谱减语音增强方法[J]. 计算机工程与设计, 2013, 34(1):235-240. (CAO L, ZHANG T Q, GAO H X, et al. Multi-band spectral subtraction method for speech enhancement based on masking property of human auditory system[J]. Computer Engineering and Design, 2013, 34(1):235-240.) [2] 李季碧, 马永保, 夏杰, 等. 一种基于修正倒谱平滑技术改进的维纳滤波语音增强算法[J]. 重庆邮电大学学报(自然科学版), 2016, 28(4):462-467. (LI J B, MA Y B, XIA J, et al. An improved Wiener filtering speech enhancement algorithm based on modified cepstrum smooth technology[J]. Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), 2016, 28(4):462-467.) [3] BOROWICZ A, PETROVSKY A. Signal subspace approach for psychoacoustically motivated speech enhancement[J]. Speech communication, 2011, 53(2):210-219. [4] HU K, WANG D. Unvoiced speech segregation from nonspeech interference via CASA and spectral subtraction[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2011, 19(6):1600-1609. [5] WANG Y, NARAYANAN A, WANG D, et al. On training targets for supervised speech separation[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2014, 22(12):1849-1858. [6] BAO F, ABDULLA W H. Noise masking method based on an effective ratio mask estimation in Gammatone channels[J]. APSIPA Transactions on Signal and Information Processing, 2018, 7(e5):1-12. [7] SUN M, LI Y, GEMMEKE J F, et al. Speech enhancement under low SNR conditions via noise estimation using sparse and low-rank NMF with Kullback-Leibler divergence[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2015, 23(7):1233-1242. [8] NAHMA L, YONG P C, DAM H H, et al. Convex combination framework for a priori SNR estimation in speech enhancement[C]//Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway, NJ. IEEE, 2017:4975-4979. [9] 蒋毅, 刘润生, 冯振明. 基于听感知特性的双麦克风近讲语音增强算法[J]. 清华大学学报(自然科学版), 2014(9):1179-1183. (JIANG Y, LIU R S, FENG Z M. Dual-microphone speech enhancement algorithm based on the auditory features for a close-talk system[J]. Journal of Tsinghua University (Science and Technology), 2014, 54(9):1179-1183.) [10] BAO F, ABDULLA W H. A new ratio mask representation for CASA-based speech enhancement[J]. IEEE/ACM Transactions on Audio, Speech and Language Processing, 2019, 27(1):7-19. [11] YONG P C, NORDHOLM S, DAM H H, et al. On the optimization of sigmoid function for speech enhancement[C]//Proceedings of the 19th European Signal Processing Conference. Piscataway:IEEE, 2011:211-215. [12] CHEN Z, HOHMANN V. Online monaural speech enhancement based on periodicity analysis and a priori SNR estimation[J]. IEEE/ACM Transactions on Audio, Speech and Language Processing, 2015, 23(11):1904-1916. [13] ZHENG C, TAN Z, PENG R, et al. Guided spectrogram filtering for speech dereverberation[J]. Applied Acoustics, 2018, 134(5):154-159. [14] GAROFOLO J S, LAMEL L F, FISHER W M, et al. TIMIT Acoustic-Phonetic Continuous Speech Corpus[EB/OL].[2019-01-12]. https://catalog.ldc.upenn.edu/LDC93S1. [15] VARGA A, STEENEKEN H J M. Assessment for automatic speech recognition Ⅱ:NOISEX-92:a database and an experiment to study the effect of additive noise on speech recognition systems[J]. Speech Communication, 1993, 12(3):247-251. [16] GERKMANN T, HENDRIKS R C. Unbiased MMSE-based noise power estimation with low complexity and low tracking delay[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2012, 20(4):1383-1393. [17] International Telecommunications Union (ITU). Perceptual Evaluation of Speech Quality (PESQ):an objective method for end-to-end speech quality assessment of narrow-band telephone networks and speech codecs[EB/OL].[2019-01-12]. https://www.itu.int/rec/T-REC-P.862-200102-I/en. [18] TAAL C H, HENDRIKS R C, HEUSDENS R, et al. An algorithm for intelligibility prediction of time-frequency weighted noisy speech[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2011, 19(7):2125-2136. [19] LOIZOU P C, KIM G. Reasons why current speech-enhancement algorithms do not improve speech intelligibility and suggested solutions[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2011, 19(1):47-56. |