1 |
ZHANG W Q, SONG B L, CAI M, et al. A query-by-example spoken term detection method based on phonetic posteriorgram [J]. Journal of Tianjin University (Science and Technology), 2015, 48(9): 757-760. (in Chinese)
|
2 |
HAZEN T J, SHEN W, WHITE C. Query-by-example spoken term detection using phonetic posteriorgram templates [C]// Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition and Understanding. Piscataway: IEEE, 2009: 421-426. 10.1109/asru.2009.5372889
|
3 |
ZHANG Y, GLASS J R. Unsupervised spoken keyword spotting via segmental DTW on Gaussian posteriorgrams [C]// Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition and Understanding. Piscataway: IEEE, 2009: 398-403. 10.1109/asru.2009.5372931
|
4 |
MANTEENA G, ANGUERA X. Speed improvements to information retrieval-based dynamic time warping using hierarchical K-Means clustering [C]// Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2013: 8515-8519. 10.1109/icassp.2013.6639327
|
5 |
LEUNG C C, WANG L, XU H, et al. Toward high-performance language-independent query-by-example spoken term detection for MediaEval 2015: post-evaluation analysis [C]// Proceedings of the INTERSPEECH 2016. [S.l.]: International Speech Communication Association, 2016: 3703-3707. 10.21437/interspeech.2016-691
|
6 |
LEVIN K, HENRY K, JANSEN A, et al. Fixed-dimensional acoustic embeddings of variable-length segments in low-resource settings [C]// Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding. Piscataway: IEEE, 2013: 410-415. 10.1109/asru.2013.6707765
|
7 |
LEVIN K, JANSEN A, VAN DURME B. Segmental acoustic indexing for zero resource keyword search [C]// Proceedings of the 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing. Piscataway: IEEE, 2015: 5828-5832. 10.1109/icassp.2015.7179089
|
8 |
SHEN F, DU C, YU K. Acoustic word embeddings for end-to-end speech synthesis [J]. Applied Sciences, 2021, 11(19): No.9010. 10.3390/app11199010
|
9 |
SHI B, SETTLE S, LIVESCU K. Whole-word segmental speech recognition with acoustic word embeddings [C]// Proceedings of the 2021 IEEE Spoken Language Technology Workshop. Piscataway: IEEE, 2021: 164-171. 10.1109/slt48900.2021.9383578
|
10 |
KAMPER H. Truly unsupervised acoustic word embeddings using weak top-down constraints in encoder-decoder models [C]// Proceedings of the 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing. Piscataway: IEEE, 2019: 6535-6539. 10.1109/icassp.2019.8683639
|
11 |
KAMPER H, WANG W, LIVESCU K. Deep convolutional acoustic word embeddings using word-pair side information [C]// Proceedings of the 2016 IEEE International Conference on Acoustics, Speech, and Signal Processing. Piscataway: IEEE, 2016: 4950-4954. 10.1109/icassp.2016.7472619
|
12 |
HUANG J, GHARBIEH W, SHIM H S, et al. Query-by-example keyword spotting system using multi-head attention and soft-triple loss [C]// Proceedings of the 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing. Piscataway: IEEE, 2021: 6858-6862. 10.1109/icassp39728.2021.9414156
|
13 |
SETTLE S, LIVESCU K. Discriminative acoustic word embeddings: recurrent neural network-based approaches [C]// Proceedings of the 2016 IEEE Spoken Language Technology Workshop. Piscataway: IEEE, 2016: 503-510. 10.1109/slt.2016.7846310
|
14 |
CHEN G, PARADA C, SAINATH T N. Query-by-example keyword spotting using long short-term memory networks [C]// Proceedings of the 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing. Piscataway: IEEE, 2015: 5236-5240. 10.1109/icassp.2015.7178970
|
15 |
YUAN Y, LV Z, HUANG S, et al. Verifying deep keyword spotting detection with acoustic word embeddings [C]// Proceedings of the 2019 IEEE Automatic Speech Recognition and Understanding Workshop. Piscataway: IEEE, 2019: 613-620. 10.1109/asru46091.2019.9003781
|
16 |
YUAN Y, XIE L, LEUNG C C, et al. Fast query-by-example speech search using attention-based deep binary embeddings [J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2020, 28: 1988-2000. 10.1109/taslp.2020.2998277
|
17 |
AO C W, LEE H Y. Query-by-example spoken term detection using attention-based multi-hop networks [C]// Proceedings of the 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing. Piscataway: IEEE, 2018: 6264-6268. 10.1109/icassp.2018.8462570
|
18 |
ZHANG K, WU Z, JIA J, et al. Query-by-example spoken term detection using attentive pooling networks [C]// Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference. Piscataway: IEEE, 2019: 1267-1272. 10.1109/apsipaasc47483.2019.9023023
|
19 |
RAM D, MICULICICH L, BOURLARD H. CNN based query by example spoken term detection [C]// Proceedings of the INTERSPEECH 2018. [S.l.]: International Speech Communication Association, 2018: 92-96. 10.21437/interspeech.2018-1722
|
20 |
RAM D, MICULICICH L, BOURLARD H. Neural network based end-to-end query by example spoken term detection [J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2020, 28: 1416-1427. 10.1109/taslp.2020.2988788
|
21 |
NAIK P, GAONKAR M N, THENKANIDIYOOR V, et al. Kernel based matching and a novel training approach for CNN-based QbE-STD [C]// Proceedings of the 2020 International Conference on Signal Processing and Communications. Piscataway: IEEE, 2020: 1-5. 10.1109/spcom50965.2020.9179588
|
22 |
HU J, SHEN L, SUN G. Squeeze-and-excitation networks [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7132-7141. 10.1109/cvpr.2018.00745
|
23 |
YUAN Y, LEUNG C C, XIE L, et al. Query-by-example speech search using recurrent neural acoustic word embeddings with temporal context [J]. IEEE Access, 2019, 7: 67656-67665. 10.1109/access.2019.2918638
|
24 |
JACOBS C, MATUSEVYCH Y, KAMPER H. Acoustic word embeddings for zero-resource languages using self-supervised contrastive learning and multilingual adaptation [C]// Proceedings of the 2021 IEEE Spoken Language Technology Workshop. Piscataway: IEEE, 2021: 919-926. 10.1109/slt48900.2021.9383594
|
25 |
ZHANG Y, PARK D S, HAN W, et al. BigSSL: exploring the frontier of large-scale semi-supervised learning for automatic speech recognition [J]. IEEE Journal of Selected Topics in Signal Processing, 2022, 16(6): 1519-1532. 10.1109/jstsp.2022.3182537
|
26 |
YANG Z, HIRSCHBERG J. Linguistically-informed training of acoustic word embeddings for low-resource languages [C]// Proceedings of the INTERSPEECH 2019. [S.l.]: International Speech Communication Association, 2019: 2678-2682. 10.21437/interspeech.2019-3119
|
27 |
SHITOV D, PIROGOVA E, WYSOCKI T A, et al. Learning acoustic word embeddings with dynamic time warping triplet networks [J]. IEEE Access, 2020, 8: 103327-103338. 10.1109/access.2020.2999055
|
28 |
LI Z, WU L, LI T, et al. Improves neural acoustic word embeddings query by example spoken term detection with Wav2Vec pretraining and circle loss [C]// Proceedings of the 12th International Symposium on Chinese Spoken Language Processing. Piscataway: IEEE, 2021: 1-5. 10.1109/iscslp49672.2021.9362065
|