| [1] |
EPHRAIM Y, MALAH D. Speech enhancement using a minimum-mean square error short-time spectral amplitude estimator[J]. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1984, 32(6): 1109-1121.
|
| [2] |
BOLL S. Suppression of acoustic noise in speech using spectral subtraction[J]. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1979, 27(2): 113-120.
|
| [3] |
WILSON K W, RAJ B, SMARAGDIS P, et al. Speech denoising using nonnegative matrix factorization with priors[C]// Proceedings of the 2008 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2008: 4029-4032.
|
| [4] |
WANG D, CHEN J. Supervised speech separation based on deep learning: an overview[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2018, 26(10): 1702-1726.
|
| [5] |
VAN DEN OORD A, DIELEMAN S, ZEN H, et al. WaveNet: a generative model for raw audio[C]// Proceedings of the 9th ISCA Speech Synthesis Workshop. [S.l.]: International Speech Communication Association, 2016: 125.
|
| [6] |
DÉFOSSEZ A, USUNIER N, BOTTOU L, et al. Demucs: deep extractor for music sources with extra unlabeled data remixed[EB/OL]. [2025-07-18].
|
| [7] |
HU Y, LIU Y, LV S, et al. DCCRN: deep complex convolution recurrent network for phase-aware speech enhancement[C]// Proceedings of the INTERSPEECH 2020. [S.l.]: International Speech Communication Association, 2020: 2472-2476.
|
| [8] |
YIN D, LUO C, XIONG Z, et al. PHASEN: a phase-and-harmonics-aware speech enhancement network[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2020: 9458-9465.
|
| [9] |
ABDALLA R. Complex-valued neural networks: theory and analysis[EB/OL]. [2025-07-18].
|
| [10] |
LU Y X, YANG A, LING Z H. MP-SENet: a speech enhancement model with parallel denoising of magnitude and phase spectra[C]// Proceedings of the INTERSPEECH 2023. [S.l.]: International Speech Communication Association, 2023: 3834-3838.
|
| [11] |
ALHUSSEIN G, ALKHODARI M, KHANDOKER A H, et al. Deep bispectral analysis of conversational speech towards emotional climate recognition[C]// Proceedings of the 2023 IEEE International Conference on Artificial Intelligence in Engineering and Technology. Piscataway: IEEE, 2023: 170-175.
|
| [12] |
WANG W, DONG L, YU Z, et al. Robust speech recognition method based on dense time-frequency convolution and bispectral refinement enhancement[J]. International Journal of Machine Learning and Cybernetics, 2025, 16(9): 5707-5725.
|
| [13] |
TAN K, WANG D. Complex spectral mapping with a convolutional recurrent network for monaural speech enhancement[C]// Proceedings of the 2019 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2019: 6865-6869.
|
| [14] |
MO S B, WANG W J, DONG L, et al. Single-channel speech enhancement based on multi-channel information aggregation and collaborative decoding[J]. Journal of Computer Applications, 2024, 44(8): 2611-2617.
|
| [15] |
CAO R, ABDULATIF S, YANG B. CMGAN: conformer-based metric GAN for speech enhancement[C]// Proceedings of the INTERSPEECH 2022. [S.l.]: International Speech Communication Association, 2022: 936-940.
|
| [16] |
ZHANG Z, XU S, ZHUANG X, et al. Dual branch deep interactive UNet for monaural noisy-reverberant speech enhancement[J]. Applied Acoustics, 2023, 212: No.109574.
|
| [17] |
SU Y, LIU Y, YANG C, et al. MN-Net: multi-scale feature fusion and neighborhood attention self-supervised network for industrial spool surface anomaly detection[C]// Proceedings of the IEEE 36th International Conference on Tools with Artificial Intelligence. Piscataway: IEEE, 2024: 282-289.
|
| [18] |
NIKIAS C L, MENDEL J M. Signal processing with higher-order spectra[J]. IEEE Signal Processing Magazine, 1993, 10(3): 10-37.
|
| [19] |
RANGOUSSI M, CARAYANNIS G. Adaptive detection of noisy speech using third-order statistics[J]. International Journal of Adaptive Control and Signal Processing, 1996, 10(2/3): 113-136.
|
| [20] |
HIRLEKAR S G, HOLAMBE R S, BASU T K. Phase recovery from bispectrum[J]. IETE Journal of Research, 2000, 46(3): 139-145.
|
| [21] |
LAVANYA T, VIJAYALAKSHMI P, MRINALINI K, et al. Higher order statistics-driven magnitude and phase spectrum estimation for speech enhancement[J]. Computer Speech and Language, 2024, 87: No.101639.
|
| [22] |
PANDEY A, WANG D. Densely connected neural network with dilated convolutions for real-time speech enhancement in the time domain[C]// Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2020: 6629-6633.
|
| [23] |
ULYANOV D, VEDALDI A, LEMPITSKY V. Instance normalization: the missing ingredient for fast stylization[EB/OL]. [2025-02-18].
|
| [24] |
HE K, ZHANG X, REN S, et al. Delving deep into rectifiers: surpassing human-level performance on ImageNet classification[C]// Proceedings of the 2015 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2015: 1026-1034.
|
| [25] |
FU S W, YU C, HSIEH T A, et al. MetricGAN+: an improved version of MetricGAN for speech enhancement[C]// Proceedings of the INTERSPEECH 2021. [S.l.]: International Speech Communication Association, 2021: 201-205.
|
| [26] |
YANG A, LING Z H. Neural speech phase prediction based on parallel estimation architecture and anti-wrapping losses[C]// Proceedings of the 2023 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2023: 1-5.
|
| [27] |
VALENTINI-BOTINHAO C, WANG X, TAKAKI S, et al. Investigating RNN-based speech enhancement methods for noise-robust text-to-speech[C]// Proceedings of the 9th ISCA Speech Synthesis Workshop. [S.l.]: International Speech Communication Association, 2016: 146-152.
|
| [28] |
VEAUX C, YAMAGISHI J, KING S. The voice bank corpus: design, collection and data analysis of a large regional accent speech database[C]// Proceedings of the 2013 International Conference on Oriental COCOSDA held jointly with 2013 Conference on Asian Spoken Language Research and Evaluation. Piscataway: IEEE, 2013: 1-4.
|
| [29] |
THIEMANN J, ITO N, VINCENT E. The Diverse Environments Multi-channel Acoustic Noise Database (DEMAND): a database of multichannel environmental noise recordings[J]. Proceedings of Meetings on Acoustics, 2013, 19(1): No.035081.
|
| [30] |
LOSHCHILOV I, HUTTER F. Decoupled weight decay regularization[EB/OL]. [2025-01-09].
|
| [31] |
PASCUAL S, BONAFONTE A, SERRÀ J. SEGAN: speech enhancement generative adversarial network[C]// Proceedings of the INTERSPEECH 2017. [S.l.]: International Speech Communication Association, 2017: 3642-3646.
|
| [32] |
KIM E, SEO H. SE-Conformer: time-domain speech enhancement using conformer[C]// Proceedings of the INTERSPEECH 2021. [S.l.]: International Speech Communication Association, 2021: 2736-2740.
|
| [33] |
FU S W, LIAO C F, TSAO Y, et al. MetricGAN: generative adversarial networks based black-box metric scores optimization for speech enhancement[C]// Proceedings of the 36th International Conference on Machine Learning. New York: JMLR.org, 2019: 2031-2041.
|
| [34] |
YIN D, ZHAO Z, TANG C, et al. TridentSE: guiding speech enhancement with 32 global tokens[C]// Proceedings of the INTERSPEECH 2023. [S.l.]: International Speech Communication Association, 2023: 3839-3843.
|
| [35] |
CHAO R, CHENG W H, LA QUATRA M, et al. An investigation of incorporating mamba for speech enhancement[C]// Proceedings of the 2024 IEEE Spoken Language Technology Workshop. Piscataway: IEEE, 2024: 302-308.
|