[1] |
CAI H T, YUAN B T. A speech enhancement algorithm based on masking properties of human auditory system [J]. Journal on Communications, 2002, 23(8): 93-98. (in Chinese)
|
[2] |
WANG Y, BROOKES M. Model-based speech enhancement in the modulation domain [J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2018, 26(3): 580-594.
|
[3] |
ALMAJAI I, MILNER B. Visually derived Wiener filters for speech enhancement [J]. IEEE Transactions on Audio, Speech, and Language Processing, 2011, 19(6): 1642-1651.
|
[4] |
LAN T, PENG C, LI S, et al. An overview of monaural speech denoising and dereverberation research [J]. Journal of Computer Research and Development, 2020, 57(5): 928-953. (in Chinese)
|
[5] |
KRAWCZYK-BECKER M, GERKMANN T. Fundamental frequency informed speech enhancement in a flexible statistical framework [J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2016, 24(5): 940-951.
|
[6] |
WOHLMAYR M, STARK M, PERNKOPF F. A probabilistic interaction model for multipitch tracking with factorial hidden Markov models [J]. IEEE Transactions on Audio, Speech, and Language Processing, 2011, 19(4): 799-810.
|
[7] |
MING J, SRINIVASAN R, CROOKES D. A corpus-based approach to speech enhancement from nonstationary noise [J]. IEEE Transactions on Audio, Speech, and Language Processing, 2011, 19(4): 822-836.
|
[8] |
PANDEY A, WANG D. TCNN: temporal convolutional neural network for real-time speech enhancement in the time domain [C]// Proceedings of the 2019 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2019: 6875-6879.
|
[9] |
LI A, LIU W, ZHENG C, et al. Two heads are better than one: a two-stage complex spectral mapping approach for monaural speech enhancement [J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2021, 29: 1829-1843.
|
[10] |
GAO T, DU J, XU Y, et al. Improving deep neural network based speech enhancement in low SNR environments [C]// Proceedings of the 2015 International Conference on Latent Variable Analysis and Signal Separation, LNCS 9237. Cham: Springer, 2015: 75-82.
|
[11] |
YIN D, LUO C, XIONG Z, et al. PHASEN: a phase-and-harmonics-aware speech enhancement network [C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2020: 9458-9465.
|
[12] |
LEE J, KANG H G. Two-stage refinement of magnitude and complex spectra for real-time speech enhancement [J]. IEEE Signal Processing Letters, 2022, 29: 2188-2192.
|
[13] |
VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2017: 6000-6010.
|
[14] |
WANG K, HE B, ZHU W P. TSTNN: two-stage Transformer based neural network for speech enhancement in the time domain [C]// Proceedings of the 2021 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2021: 7098-7102.
|
[15] |
YU G, LI A, WANG H, et al. DBT-Net: dual-branch federative magnitude and phase estimation with attention-in-attention Transformer for monaural speech enhancement [J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2022, 30: 2629-2644.
|
[16] |
ZHANG Q, SONG Q, NI Z, et al. Time-frequency attention for monaural speech enhancement [C]// Proceedings of the 2022 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2022: 7852-7856.
|
[17] |
ZHANG T Q, LUO Q Y, ZHANG H Z, et al. Speech enhancement method based on complex spectrum mapping with efficient Transformer [J]. Journal of Signal Processing, 2024, 40(2): 406-416. (in Chinese)
|
[18] |
HU Y, LIU Y, LV S, et al. DCCRN: deep complex convolution recurrent network for phase-aware speech enhancement [C]// Proceedings of the INTERSPEECH 2020. [S.l.]: International Speech Communication Association, 2020: 2472-2476.
|
[19] |
ZHANG S, LEI M, YAN Z, et al. Deep-FSMN for large vocabulary continuous speech recognition [C]// Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2018: 5869-5873.
|
[20] |
ZHAO S, MA B, WATCHARASUPAT K N, et al. FRCRN: boosting feature representation using frequency recurrence for monaural speech enhancement [C]// Proceedings of the 2022 IEEE International Conference on Acoustics, Speech, and Signal Processing. Piscataway: IEEE, 2022: 9281-9285.
|
[21] |
HAN K, WANG Y, CHEN H, et al. A survey on Vision Transformer [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(1): 87-110.
|
[22] |
VEAUX C, YAMAGISHI J, KING S. The voice bank corpus: design, collection and data analysis of a large regional accent speech database [C]// Proceedings of the 2013 International Conference of the Oriental COCOSDA held jointly with 2013 Conference on Asian Spoken Language Research and Evaluation. Piscataway: IEEE, 2013: 1-4.
|
[23] |
THIEMANN J, ITO N, VINCENT E. The diverse environments multi-channel acoustic noise database: a database of multichannel environmental noise recordings [J]. The Journal of the Acoustical Society of America, 2013, 133(S5): No.4806631.
|
[24] |
GAROFOLO J S, LAMEL L F, FISHER W M, et al. DARPA TIMIT acoustic-phonetic continuous speech corpus CD-ROM: NISTIR 4930 [R/OL]. [2024-12-14]. .
|
[25] |
VARGA A, STEENEKEN H J M. Assessment for automatic speech recognition: Ⅱ. NOISEX-92: a database and an experiment to study the effect of additive noise on speech recognition systems [J]. Speech Communication, 1993, 12(3): 247-251.
|
[26] |
HUANG H X, WU R J, HUANG J, et al. DCCRGAN: deep complex convolution recurrent generator adversarial network for speech enhancement [C]// Proceedings of the 2022 International Symposium on Electrical, Electronics and Information Engineering. Piscataway: IEEE, 2022: 30-35.
|
[27] |
LV S, FU Y, XING M, et al. S-DCCRN: super wide band DCCRN with learnable complex feature for speech enhancement [C]// Proceedings of the 2022 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2022: 7767-7771.
|
[28] |
ZHOU L, GAO Y, WANG Z, et al. Complex spectral mapping with attention based convolution recurrent neural network for speech enhancement [EB/OL]. [2024-12-22]. .
|
[29] |
YU G, WANG Y, WANG H, et al. A two-stage complex network using cycle-consistent generative adversarial networks for speech enhancement [J]. Speech Communication, 2021, 134: 42-54.
|
[30] |
LI Y, SUN M, ZHANG X. Scale-aware dual-branch complex convolutional recurrent network for monaural speech enhancement [J]. Computer Speech and Language, 2024, 86: No.101618.
|