Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (3): 715-721. DOI: 10.11772/j.issn.1001-9081.2023030358
Topic: Artificial Intelligence
Zongze JIA, Pengfei GAO, Yinglong MA, Xiaofeng LIU, Haixin XIA
Received: 2023-04-04
Revised: 2023-06-06
Accepted: 2023-06-08
Online: 2023-07-04
Published: 2024-03-10
Contact: Yinglong MA
About author: JIA Zongze, born in 1997 in Yuncheng, Shanxi, M. S. candidate. His research interests include natural language processing.
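Abstract:
Deep learning models are now widely used for dialogue act recognition, mining multiple dialogue act features to improve classification performance. However, these methods overlook the latent associations and mutual influences among different dialogue act features, and the classification process rarely considers the semantic relationships among dialogue act labels; both shortcomings hinder further gains in recognition performance. To address these problems, a Multi-feature Fusion Attention-based Hierarchical Classification (MFA-HC) method is proposed for dialogue act recognition. First, a hierarchical dialogue act classification framework based on learning without forgetting is proposed, which combines multiple fine-grained features, including words, parts of speech, and related linguistic statistics, to train the dialogue act classification model. Second, an attention-based universality-individuality model is proposed to capture the universal and individual characteristics of the different features. Experimental results on two benchmark datasets, SwDA (Switchboard Dialogue Act corpus) and MRDA (ICSI Meeting Recorder Dialogue Act corpus), show that, compared with DARER (Dual-tAsk temporal Relational rEcurrent Reasoning network), the current method with the better overall performance, MFA-HC improves classification accuracy by 0.6% and 0.1%, respectively, by capturing the universal and individual features implicit in utterances.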
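The fusion idea in the abstract can be illustrated with a small sketch. This is not the paper's universality-individuality model, whose architecture is not given on this page. It only assumes, for illustration, that each feature type (word, part of speech, linguistic statistics) is already encoded as a vector of the same length, that the shared ("universal") part is a dot-product-attention weighted sum, and that the feature-specific ("individual") part is each vector's residual from that sum. The names `attention_fuse`, `query`, and the toy vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention_fuse(features, query):
    """Weight each feature vector by softmax(dot(feature, query)) and sum them."""
    names = list(features)
    weights = softmax([dot(features[n], query) for n in names])
    fused = [
        sum(w * features[n][i] for w, n in zip(weights, names))
        for i in range(len(query))
    ]
    return dict(zip(names, weights)), fused


# Toy 3-dimensional encodings (assumed values, not taken from the paper).
features = {
    "word": [0.9, 0.1, 0.3],
    "pos": [0.2, 0.8, 0.1],
    "stats": [0.1, 0.2, 0.7],
}
query = [1.0, 0.0, 0.0]  # hypothetical query vector; learned in a real model

weights, universal = attention_fuse(features, query)
# Assumed definition: the individual part is what the shared summary leaves out.
individual = {
    name: [x - u for x, u in zip(vec, universal)]
    for name, vec in features.items()
}
```

In a trained model the query and the encoders would be learned jointly with the classifier; here they are fixed so that the weighting step is easy to inspect.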
Zongze JIA, Pengfei GAO, Yinglong MA, Xiaofeng LIU, Haixin XIA. Multi-feature fusion attention-based hierarchical classification method for dialogue act[J]. Journal of Computer Applications, 2024, 44(3): 715-721.
Tab. 1 Information of experiment datasets
Dataset | | | Training dialogues | Training utterances/10³ | Validation dialogues | Validation utterances/10³ | Test dialogues | Test utterances/10³ |
---|---|---|---|---|---|---|---|---|
MRDA | 5 | 10 | 51 | 76 | 11 | 15 | 11 | 15 |
SwDA | 42 | 19 | 1 003 | 173 | 112 | 22 | 19 | 4 |
Tab. 2 Result comparison of different models on SwDA and MRDA (%)
Type | Model | MRDA Acc | MRDA F1 | SwDA Acc | SwDA F1 |
---|---|---|---|---|---|
Flat classifier | CNN-prosody | 84.7 | 79.3 | 75.1 | 70.6 |
Flat classifier | STM | 91.4 | 87.1 | 83.2 | 79.1 |
Flat classifier | SPARTA | 90.2 | 85.9 | 80.1 | 76.2 |
Flat classifier | MDOM | 91.9 | 87.6 | 81.6 | 77.8 |
Flat classifier | DARER | 93.2 | 88.6 | 83.9 | 79.2 |
Hierarchical classifier | Bi-LSTM-CRF | 90.9 | 85.6 | 79.2 | 74.3 |
Hierarchical classifier | NSIM | 89.9 | 85.1 | 80.5 | 76.1 |
Hierarchical classifier | BiRNN-attention | 91.1 | 87.9 | 82.9 | 79.4 |
Hierarchical classifier | Dual-attention | 92.2 | 88.1 | 82.3 | 78.6 |
Hierarchical classifier | HLSN | 90.5 | 86.3 | 82.9 | 76.9 |
Hierarchical classifier | UIIM | 89.9 | 85.7 | 78.6 | 74.8 |
Proposed method | MFA-HC | 93.3 | 88.5 | 84.4 | 79.5 |
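The accuracy gains over DARER can be recomputed from the Tab. 2 values. The sketch below reports both the absolute difference in percentage points and the relative improvement. Reading the abstract's 0.6% and 0.1% as relative improvements is an assumption, since the page does not say which measure the authors used. The helper name `gains` is hypothetical.

```python
# Accuracy (%) of DARER and MFA-HC, copied from Tab. 2.
results = {
    "MRDA": {"DARER": 93.2, "MFA-HC": 93.3},
    "SwDA": {"DARER": 83.9, "MFA-HC": 84.4},
}


def gains(baseline, proposed):
    """Return (absolute gain in points, relative gain in %), rounded to 1 decimal."""
    absolute = proposed - baseline
    relative = absolute / baseline * 100
    return round(absolute, 1), round(relative, 1)


for dataset, acc in results.items():
    print(dataset, gains(acc["DARER"], acc["MFA-HC"]))
# prints: MRDA (0.1, 0.1)
# prints: SwDA (0.5, 0.6)
```

On SwDA the absolute gain is 0.5 points while the relative gain rounds to 0.6%, which matches the abstract's figure only under the relative reading.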
Tab. 3 Accuracies of MFA-HC and other hierarchical classification models on SwDA (%)
Type | Model | Layer 1 | Layer 2 | Layer 3 |
---|---|---|---|---|
Hierarchical classifier | Bi-LSTM-CRF | 90.9 | 86.1 | 79.2 |
Hierarchical classifier | NSIM | 92.1 | 87.4 | 80.5 |
Hierarchical classifier | BiRNN-attention | 94.4 | 89.3 | 82.9 |
Hierarchical classifier | Dual-attention | 94.7 | 89.1 | 82.3 |
Hierarchical classifier | HLSN | 93.9 | 88.8 | 81.9 |
Proposed method | MFA-HC | 94.6 | 90.1 | 84.4 |
Tab. 4 Accuracy comparison with different intervals of utterance length classification on SwDA and MRDA
Interval size | MRDA Acc/% | SwDA Acc/% |
---|---|---|
2 | 91.6 | 81.9 |
3 | 92.2 | 83.1 |
4 | 92.9 | 83.9 |
5 | 93.3 | 84.4 |
6 | 92.7 | 83.7 |
7 | 91.4 | 82.6 |
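One way to picture the interval parameter in Tab. 4 is as the width of the bins used to turn utterance length into a categorical feature. The sketch below is an assumption about that mechanism, not the paper's definition: it tokenizes on whitespace and uses floor division, and the function name `length_bucket` is hypothetical. It also selects the best interval per dataset directly from the table values.

```python
def length_bucket(utterance, interval=5):
    """Map an utterance to a length class: token count floor-divided by the interval."""
    tokens = utterance.split()  # assumed whitespace tokenization
    return len(tokens) // interval


# Accuracy (%) per interval size, copied from Tab. 4.
accuracy = {
    "MRDA": {2: 91.6, 3: 92.2, 4: 92.9, 5: 93.3, 6: 92.7, 7: 91.4},
    "SwDA": {2: 81.9, 3: 83.1, 4: 83.9, 5: 84.4, 6: 83.7, 7: 82.6},
}

# Interval with the highest accuracy on each dataset.
best_interval = {
    dataset: max(scores, key=scores.get) for dataset, scores in accuracy.items()
}
```

With these table values, `best_interval` selects an interval of 5 on both datasets, and accuracy falls off on either side of it.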
Tab. 5 Results of ablation study on different test sets (accuracy, %)
Type | Model | MRDA | SwDA |
---|---|---|---|
Feature | w/o part-of-speech features | 90.9 | 82.6 |
Feature | w/o linguistic statistical features | 92.2 | 83.9 |
Feature | w/o part-of-speech features & linguistic statistical features | 88.3 | 81.2 |
Fusion | w/o universality | 91.0 | 82.0 |
Fusion | w/o individuality | 90.7 | 82.1 |
Fusion | w/o UIM | 88.2 | 80.3 |
Full model | MFA-HC | 93.3 | 84.4 |
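The ablation results are easier to compare as accuracy drops relative to the full model. The sketch below derives those drops from the Tab. 5 values and ranks the ablated variants; the helper name `drops` is hypothetical and the numbers are copied from the table, not re-measured.

```python
# Accuracy (%) of the full model and each ablated variant, copied from Tab. 5.
full = {"MRDA": 93.3, "SwDA": 84.4}
ablations = {
    "w/o part-of-speech features": {"MRDA": 90.9, "SwDA": 82.6},
    "w/o linguistic statistical features": {"MRDA": 92.2, "SwDA": 83.9},
    "w/o both feature types": {"MRDA": 88.3, "SwDA": 81.2},
    "w/o universality": {"MRDA": 91.0, "SwDA": 82.0},
    "w/o individuality": {"MRDA": 90.7, "SwDA": 82.1},
    "w/o UIM": {"MRDA": 88.2, "SwDA": 80.3},
}


def drops(dataset):
    """Return (accuracy drop in points, variant) pairs, largest drop first."""
    pairs = [
        (round(full[dataset] - acc[dataset], 1), name)
        for name, acc in ablations.items()
    ]
    return sorted(pairs, reverse=True)


for dataset in full:
    print(dataset, drops(dataset)[0])
```

On both datasets the largest drop comes from removing UIM (5.1 points on MRDA, 4.1 on SwDA), and the smallest from removing only the linguistic statistical features.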