| 1 | 陈晨, 韩纪庆, 陈德运, 等. 文本无关说话人识别中句级特征提取方法研究综述[J]. 自动化学报, 2022, 48(3): 664-688. |
|  | CHEN C, HAN J Q, CHEN D Y, et al. Utterance-level feature extraction in text-independent speaker recognition: a review[J]. Acta Automatica Sinica, 2022, 48(3): 664-688. |
| 2 | VARIANI E, LEI X, McDERMOTT E, et al. Deep neural networks for small footprint text-dependent speaker verification[C]// Proceedings of the 2014 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2014: 4052-4056. |
| 3 | SNYDER D, GARCIA-ROMERO D, SELL G, et al. X-vectors: robust DNN embeddings for speaker recognition[C]// Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2018: 5329-5333. |
| 4 | CHUNG J S, HUH J, MUN S, et al. In defence of metric learning for speaker recognition[EB/OL]. (2020-03-26) [2023-08-01]. |
| 5 | DESPLANQUES B, THIENPONDT J, DEMUYNCK K. ECAPA-TDNN: emphasized channel attention, propagation and aggregation in TDNN based speaker verification[EB/OL]. (2020-05-14) [2023-08-01]. |
| 6 | THIENPONDT J, DESPLANQUES B, DEMUYNCK K. Integrating frequency translational invariance in TDNNs and frequency positional information in 2D ResNets to enhance speaker verification[EB/OL]. (2021-04-06) [2023-08-01]. |
| 7 | ZHAO M, MA Y, LIU M, et al. The SpeakIn system for VoxCeleb Speaker Recognition Challange 2021[EB/OL]. (2021-09-05) [2023-08-01]. |
| 8 | WAN Z-K, REN Q-H, QIN Y-C, et al. Statistical pyramid dense time delay neural network for speaker verification[C]// Proceedings of the 2022 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2022: 7532-7536. |
| 9 | HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 770-778. |
| 10 | GAO S-H, CHENG M-M, ZHAO K, et al. Res2Net: a new multi-scale backbone architecture[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(2): 652-662. |
| 11 | 陈志高, 李鹏, 肖润秋, 等. 文本无关说话人识别的一种多尺度特征提取方法[J]. 电子与信息学报, 2021, 43(11): 3266-3271. |
|  | CHEN Z G, LI P, XIAO R Q, et al. A multi-scale feature extraction method for text-independent speaker recognition[J]. Journal of Electronics & Information Technology, 2021, 43(11): 3266-3271. |
| 12 | 邓力洪, 邓飞, 张葛祥, 等. 改进 Res2Net 的多尺度端到端说话人识别系统[J]. 计算机工程与应用, 2023, 59(24): 110-120. |
|  | DENG L H, DENG F, ZHANG G X, et al. Multi-scale end-to-end speaker recognition system based on improved Res2Net[J]. Computer Engineering and Applications, 2023, 59(24): 110-120. |
| 13 | WANG X, XUE F, WANG W, et al. A network model of speaker identification with new feature extraction methods and asymmetric BLSTM[J]. Neurocomputing, 2020, 403: 167-181. |
| 14 | ABRAHAM J V T, KHAN A N, SHAHINA A. A deep learning approach for robust speaker identification using chroma energy normalized statistics and Mel frequency cepstral coefficients[J]. International Journal of Speech Technology, 2023, 26: 579-587. |
| 15 | LIU T, DAS R K, LEE K A, et al. MFA: TDNN with multi-scale frequency-channel attention for text-independent speaker verification with short utterances[C]// Proceedings of the 2022 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2022: 7517-7521. |
| 16 | DAI Y, GIESEKE F, OEHMCKE S, et al. Attentional feature fusion[C]// Proceedings of the 2021 IEEE Winter Conference on Applications of Computer Vision. Piscataway: IEEE, 2021: 3559-3568. |
| 17 | PARK D S, CHAN W, ZHANG Y, et al. SpecAugment: a simple data augmentation method for automatic speech recognition[EB/OL]. (2019-08-18) [2023-08-01]. |
| 18 | SNYDER D, CHEN G, POVEY D. MUSAN: a music, speech, and noise corpus[EB/OL]. (2015-10-28) [2023-08-01]. |
| 19 | JUNG J-W, KIM Y J, HEO H-S, et al. Pushing the limits of raw waveform speaker recognition[EB/OL]. (2022-03-16) [2023-08-01]. |