[1] KINNUNEN T, LI H. An overview of text-independent speaker recognition:from features to supervectors[J]. Speech Communication, 2010, 52(1):12-40. [2] DEHAK N, KENNY P J, DEHAK R, et al. Front-end factor analysis for speaker verification[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2011, 19(4):788-798. [3] LI C, MA X, JIANG B, et al. Deep speaker:an end-to-end neural speaker embedding system[EB/OL].[2019-01-10]. https://arxiv.org/pdf/1705.02304.pdf. [4] LEI Y, SCHEFFER N, FERRER L, et al. A novel scheme for speaker recognition using a phonetically-aware deep neural network[C]//Proceedings of the 2014 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway:IEEE, 2014:1695-1699. [5] FU T, QIAN Y, LIU Y, et al. Tandem deep features for text-dependent speaker verification[EB/OL].[2019-01-10]. https://www.isca-speech.org/archive/archive_papers/interspeech_2014/i14_1327.pdf. [6] TIAN Y, CAI M, HE L, et al. Investigation of bottleneck features and multilingual deep neural networks for speaker verification[EB/OL].[2019-01-10]. https://www.isca-speech.org/archive/interspeech_2015/papers/i15_1151.pdf. [7] VARIANI E, LEI X, McDERMOTT E, et al. Deep neural networks for small footprint text-dependent speaker verification[C]//Proceedings of the 2014 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway:IEEE, 2014:4052-4056. [8] CAI W, CHEN J, LI M. Analysis of length normalization in end-to-end speaker verification system[EB/OL].[2019-01-10]. https://arxiv.org/pdf/1806.03209.pdf. [9] 王昕, 张洪冉. 基于DNN处理的鲁棒性I-Vector说话人识别算法[J]. 计算机工程与应用, 2018, 54(22):167-172. (WANG X, ZHANG H R. Robust i-vector speaker recognition method based on DNN processing[J]. Computer Engineering and Applications, 2018, 54(22):167-172.) [10] LIU W, WEN Y, YU Z, et al. SphereFace:deep hypersphere embedding for face recognition[C]//Proceedings of the IEEE 2017 Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE, 2017:6738-6746. [11] HEIGOLD G, MORENO I, BENGIO S, et al. End-to-end text-dependent speaker verification[C]//Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway:IEEE, 2016:5115-5119. [12] SNYDER D, GHAHREMANI P, POVEY D, et al. Deep neural network-based speaker embeddings for end-to-end speaker verification[C]//Proceedings of the 2016 IEEE Spoken Language Technology Workshop. Piscataway:IEEE, 2016:165-170. [13] ZHANG Y, PEZESHKI M, BRAKEL P, et al. Towards end-to-end speech recognition with deep convolutional neural networks[EB/OL].[2019-01-10]. https://arxiv.org/pdf/1701.02720.pdf. [14] ZHANG C, KOISHIDA K. End-to-end text-independent speaker verification with triplet loss on short utterances[EB/OL].[2019-01-10]. https://www.isca-speech.org/archive/Interspeech_2017/pdfs/1608.PDF. [15] WEN Y, ZHANG K, LI Z, et al. A discriminative feature learning approach for deep face recognition[C]//Proceedings of the 2016 European Conference on Computer Vision, LNCS 9911. Cham:Springer, 2016:499-515. [16] LIU W, WEN Y, YU Z, et al. Large-margin softmax loss for convolutional neural networks[EB/OL].[2019-01-10]. https://arxiv.org/pdf/1612.02295.pdf. [17] HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]//Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE, 2016:770-778. [18] CHUNG J S, NAGRANI A, ZISSERMAN A. VoxCeleb2:deep speaker recognition[EB/OL].[2019-01-10]. https://arxiv.org/pdf/1806.05622.pdf. [19] NAGRANI A, CHUNG J S, ZISSERMAN A. VoxCeleb:a large-scale speaker identification dataset[EB/OL].[2019-01-10]. https://arxiv.org/pdf/1706.08612.pdf. |