Journal of Computer Applications, 2025, Vol. 45, Issue (4): 1190-1198. DOI: 10.11772/j.issn.1001-9081.2024030331
• Artificial intelligence •
Tianyu LIU, Ye TAO, Chaofeng LU, Jiawang LIU
Received: 2024-03-25
Revised: 2024-04-29
Accepted: 2024-05-06
Online: 2024-06-04
Published: 2025-04-10
Contact: Ye TAO
About author: LIU Tianyu, born in 1999, from Jining, Shandong, M. S. candidate. Her research interests include natural language processing and speaker identification.
Tianyu LIU, Ye TAO, Chaofeng LU, Jiawang LIU. Novel speaker identification framework based on narrative unit and reliable label[J]. Journal of Computer Applications, 2025, 45(4): 1190-1198.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2024030331
Dataset | Training samples | Validation samples | Test samples | Books | Total sentences | Dialogue sentences | Dialogue ratio/% |
---|---|---|---|---|---|---|---|
WP | 2 000 | 298 | 298 | 1 | 15 171 | 2 596 | 17.11 |
JY | 17 159 | 5 719 | 5 719 | 3 | 69 043 | 28 597 | 41.42 |
CNSI | 20 204 | 6 658 | 6 658 | 11 | 89 705 | 26 862 | 29.94 |
Tab. 1 Information of datasets
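The dialogue ratio column in Tab. 1 can be reproduced from the two sentence-count columns. The sketch below is only an illustrative check, assuming the ratio is dialogue sentences divided by total sentences; the names `TABLE_1` and `dialogue_ratio` are hypothetical and do not come from the paper.

```python
# Sentence counts copied from Tab. 1:
# dataset -> (total sentences, dialogue sentences, reported dialogue ratio in %)
TABLE_1 = {
    "WP": (15171, 2596, 17.11),
    "JY": (69043, 28597, 41.42),
    "CNSI": (89705, 26862, 29.94),
}


def dialogue_ratio(total_sentences: int, dialogue_sentences: int) -> float:
    """Return dialogue sentences as a percentage of all sentences (2 decimals).

    Assumption: this is how the ratio column of Tab. 1 is defined.
    """
    return round(100.0 * dialogue_sentences / total_sentences, 2)


if __name__ == "__main__":
    for name, (total, dialogue, reported) in TABLE_1.items():
        # Print the recomputed ratio next to the value reported in the table.
        print(name, dialogue_ratio(total, dialogue), reported)
```

Under that assumption, the recomputed ratios agree with the reported values for all three datasets.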
Method | Pre-trained model | Rule usage | WP | JY | CNSI |
---|---|---|---|---|---|
Random | None | None | 37.6 | 33.7 | 42.7 |
Rule | None | Pure rules | 72.1 | 86.6 | 78.4 |
SVM | None | Rule-based features | 61.0 | 94.5 | — |
MLP | None | Rule-based features | 70.5 | 95.6 | — |
CSN | BERT-base | Post-processing | 82.5 | 97.4 | 89.1 |
ChatGPT-3.5-turbo | GPT-3.5 | None | 83.8 | 97.9 | 90.7 |
E2E_SI | RoBERTa-wwm-large | None | 80.9 | 98.3 | 86.1 |
SSN | BERT-wwm-ext | None | 83.5 | 98.1 | 90.9 |
SSN | BERT-base | None | 84.3 | 98.3 | 91.3 |
Tab. 2 Comparison of accuracies of various methods on different datasets
[m,n] | Average context length/characters | Average number of candidates | Accuracy/% |
---|---|---|---|
[-5,+5] | 268 | 3.28 | 89.8 |
[-10,+10] | 552 | 4.32 | 89.2 |
NUCS | 176 | 2.86 | 91.3 |
Tab. 3 Comparison of different context determination methods on CNSI dataset
Method | Accuracy/%, DL=1 000 | Accuracy/%, DL=2 000 | Accuracy/%, DL=5 000 | Accuracy/%, DL=10 000 | Accuracy/%, DL=20 000 |
---|---|---|---|---|---|
SSN | 85.9 | 86.7 | 88.3 | 89.8 | 91.3 |
SSN+ST | 86.1 | 87.2 | 89.1 | 90.4 | 91.4 |
SSN+ST+RPLS | 86.5 | 87.5 | 89.7 | 90.8 | 91.7 |
Tab. 4 Results of ablation experiments
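To read the ablation results in Tab. 4 more easily, one can compute each variant's accuracy gain over the SSN baseline at every DL setting. The following is a minimal sketch that only does arithmetic on the reported numbers; `ACCURACY`, `DL_SIZES`, and `gains_over_baseline` are hypothetical names introduced for illustration.

```python
# Accuracy values (%) copied from Tab. 4, one entry per DL setting.
DL_SIZES = [1000, 2000, 5000, 10000, 20000]
ACCURACY = {
    "SSN": [85.9, 86.7, 88.3, 89.8, 91.3],
    "SSN+ST": [86.1, 87.2, 89.1, 90.4, 91.4],
    "SSN+ST+RPLS": [86.5, 87.5, 89.7, 90.8, 91.7],
}


def gains_over_baseline(method: str, baseline: str = "SSN") -> list[float]:
    """Return per-DL accuracy gains (percentage points) of `method` over `baseline`."""
    return [
        round(acc - base, 1)  # one decimal, matching the table's precision
        for acc, base in zip(ACCURACY[method], ACCURACY[baseline])
    ]


if __name__ == "__main__":
    for method in ("SSN+ST", "SSN+ST+RPLS"):
        # Pair each DL setting with the corresponding gain over SSN.
        print(method, dict(zip(DL_SIZES, gains_over_baseline(method))))
```

With the values as reported, the full SSN+ST+RPLS variant gains between 0.4 and 1.4 percentage points over SSN, with the largest gain at DL=5 000.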