Journal of Computer Applications ›› 2021, Vol. 41 ›› Issue (11): 3132-3138. DOI: 10.11772/j.issn.1001-9081.2021010040
Special Issue: Artificial Intelligence
• Artificial Intelligence •

Multi-head attention memory network for short text sentiment classification
Yu DENG1, Xiaoyu LI1(), Jian CUI2, Qi LIU3
Received: 2021-01-11
Revised: 2021-03-03
Accepted: 2021-03-30
Online: 2021-04-26
Published: 2021-11-10
Contact: Xiaoyu LI
About the authors: DENG Yu, born in 1983 in Jingdezhen, Jiangxi, Ph. D. candidate. His research interests include natural language processing and deep learning. LI Xiaoyu, born in 1985 in Heze, Shandong, Ph. D., associate professor. Her research interests include data mining, quantum machine learning, and big data. CUI Jian, born in 1981 in Yingkou, Liaoning, Ph. D. His research interests include information fusion and data mining.
CLC Number:
Yu DENG, Xiaoyu LI, Jian CUI, Qi LIU. Multi-head attention memory network for short text sentiment classification[J]. Journal of Computer Applications, 2021, 41(11): 3132-3138.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2021010040
Dataset | Number of classes | Total samples | Test samples
---|---|---|---
MR | 2 | 10 662 | 1 067
SST-1 | 5 | 11 855 | 2 210
SST-2 | 2 | 9 613 | 1 821

Tab. 1 Experimental data statistics
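For readers who want to reuse these splits, the statistics in Tab. 1 can be restated directly in code. The short Python snippet below only re-encodes the published numbers and derives the test proportion from them; it is not the authors' preprocessing pipeline.

```python
# Dataset statistics from Tab. 1: number of classes, total samples, test samples.
DATASETS = {
    "MR":    {"classes": 2, "total": 10_662, "test": 1_067},
    "SST-1": {"classes": 5, "total": 11_855, "test": 2_210},
    "SST-2": {"classes": 2, "total": 9_613,  "test": 1_821},
}

for name, stats in DATASETS.items():
    test_ratio = stats["test"] / stats["total"]
    print(f"{name}: {stats['classes']} classes, {stats['total']} samples, "
          f"{test_ratio:.1%} held out for testing")
```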
Parameter | Value
---|---
Dropout | 0.2
Batch size | 32
Maximum sequence length | 200
L2 regularization | 
Hidden layer dimension | 300
Number of attention heads | 8
Convolution window sizes | 2, 3, 4
Optimizer | Adam

Tab. 2 Hyperparameter settings of the model
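To make Tab. 2 concrete, the settings can be gathered into a single configuration object and handed to an Adam optimizer, with the L2 term applied through weight decay as is typical in PyTorch. This is only an illustrative sketch: the learning rate and the L2 coefficient are not given in the table, so the values below are placeholders, and the `nn.Sequential` stand-in is not the MAMN architecture described in the paper.

```python
from dataclasses import dataclass, field
from typing import List

import torch
import torch.nn as nn


@dataclass
class MAMNConfig:
    """Hyperparameters from Tab. 2; lr and l2_weight_decay are placeholders."""
    dropout: float = 0.2
    batch_size: int = 32
    max_seq_len: int = 200
    hidden_dim: int = 300
    num_attention_heads: int = 8
    conv_window_sizes: List[int] = field(default_factory=lambda: [2, 3, 4])
    lr: float = 1e-3               # assumption: not reported in Tab. 2
    l2_weight_decay: float = 1e-5  # assumption: the table leaves this value blank


cfg = MAMNConfig()

# Stand-in model; the actual MAMN combines multi-head attention with a memory network.
model = nn.Sequential(
    nn.Linear(cfg.hidden_dim, cfg.hidden_dim),
    nn.Dropout(cfg.dropout),
)

# Adam optimizer with the L2 regularization term applied through weight_decay.
optimizer = torch.optim.Adam(model.parameters(),
                             lr=cfg.lr, weight_decay=cfg.l2_weight_decay)
```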
Model | MR | SST-1 | SST-2
---|---|---|---
RAE | 0.777 | 0.432 | 0.824
MV-RNN | 0.790 | 0.444 | 0.829
RNTN | 0.759 | 0.457 | 0.854
CNN-multichannel | 0.811 | 0.474 | 0.881
CNN-non-static | 0.815 | 0.480 | 0.872
RNN-Capsule | 0.838 | 0.493 | —
Capsule-CNN | 0.823 | — | 0.868
BiLSTM-CRF | 0.823 | 0.485 | 0.883
MAMN | 0.842 | 0.496 | 0.887

Tab. 3 Classification accuracies of different models on three datasets
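The values in Tab. 3 are classification accuracies, i.e. the fraction of test samples whose predicted label matches the gold label. A minimal illustration of that metric (not the authors' evaluation code) is:

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], gold: Sequence[int]) -> float:
    """Fraction of examples whose predicted label equals the gold label."""
    assert len(predictions) == len(gold) and len(gold) > 0
    correct = sum(int(p == g) for p, g in zip(predictions, gold))
    return correct / len(gold)


# Example: 3 of 4 predictions are correct, so accuracy is 0.75.
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
```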
1 | LIU B. Sentiment Analysis and Opinion Mining [M]. San Rafael: Morgan & Claypool Publishers, 2012: 1-20. 10.1007/978-1-4614-3223-4_13 |
2 | LIN Y, LEI H, WU J, et al. An empirical study on sentiment classification of Chinese review using word embedding [EB/OL]. (2015-11-05) [2019-11-09]. . |
3 | TANG D Y, QIN B, LIU T. Document modeling with gated recurrent neural network for sentiment classification [C]// Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2015: 1422-1432. 10.18653/v1/d15-1167 |
4 | MOUSA A, SCHULLER B. Contextual bidirectional long short-term memory recurrent neural network language models: a generative approach to sentiment analysis [C]// Proceedings of the 2017 15th Conference of the European Chapter of the Association for Computational Linguistics. Stroudsburg: ACL, 2017: 1023-1032. 10.18653/v1/e17-1096 |
5 | LIAO S Y, WANG J B, YU R Y, et al. CNN for situations understanding based on sentiment analysis of twitter data [J]. Procedia Computer Science, 2017, 111: 376-381. 10.1016/j.procs.2017.06.037 |
6 | ZHOU J, HUANG J X, CHEN Q, et al. Deep learning for aspect-level sentiment classification: Survey, vision, and challenges [J]. IEEE Access, 2019, 7: 78454-78483. 10.1109/access.2019.2920075 |
7 | MNIH V, HEESS N, GRAVES A, et al. Recurrent models of visual attention [C]// Proceedings of the 2014 27th International Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2014: 2204-2212. |
8 | BAHDANAU D, CHO K, BENGIO Y. Neural machine translation by jointly learning to align and translate [EB/OL]. (2016-05-19) [2020-12-03]. . 10.3115/v1/w14-4009 |
9 | YIN W P, EBERT S, SCHÜTZE H. Attention-based convolutional neural network for machine comprehension [C]// Proceedings of the 2016 Workshop on Human-Computer Question Answering. Stroudsburg: ACL, 2016: 15-21. 10.18653/v1/w16-0103 |
10 | WANG Y Q, HUANG M L, ZHU X Y, et al. Attention-based LSTM for aspect-level sentiment classification [C]// Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2016: 606-615. 10.18653/v1/d16-1058 |
11 | TANG D Y, QIN B, FENG X C, et al. Effective LSTMs for target-dependent sentiment classification [EB/OL]. [2020-12-03]. . |
12 | MA D H, LI S J, ZHANG X D, et al. Interactive attention networks for aspect-level sentiment classification [C]// Proceedings of the 2017 26th International Joint Conference on Artificial Intelligence. Menlo Park: AAAI Press, 2017: 4068-4074. 10.24963/ijcai.2017/568 |
13 | TANG D Y, QIN B, LIU T. Aspect level sentiment classification with deep memory network [C]// Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2016: 214-224. 10.18653/v1/d16-1021 |
14 | CHEN P, SUN Z Q, BING L D, et al. Recurrent attention network on memory for aspect sentiment analysis [C]// Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2017: 452-461. 10.18653/v1/d17-1047 |
15 | DENG Y, LEI H, LI X Y, et al. A multi-hop attention deep model for aspect-level sentiment classification [J]. Journal of University of Electronic Science and Technology of China, 2019, 48(5): 759-766. 10.3969/j.issn.1001-0548.2019.05.016 |
16 | VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need [C]// Proceedings of the 2017 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2017: 6000-6010. 10.1016/s0262-4079(17)32358-8 |
17 | AMBARTSOUMIAN A, POPOWICH F. Self-attention: a better building block for sentiment analysis neural network classifiers [C]// Proceedings of the 2018 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis. Stroudsburg: ACL, 2018: 130-139. 10.18653/v1/w18-6219 |
18 | LETARTE G, PARADIS F, GIGUÈRE P, et al. Importance of self-attention for sentiment analysis [C]// Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP. Stroudsburg: ACL, 2018: 267-275. 10.18653/v1/w18-5429 |
19 | SONG Y W, WANG J H, JIANG T, et al. Attentional encoder network for targeted sentiment classification [EB/OL]. (2019-04-01) [2020-12-03]. . 10.1007/978-3-030-30490-4_9 |
20 | HAO J, WANG X, SHI S M, et al. Multi-granularity self-attention for neural machine translation [C]// Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing/the 9th International Joint Conference on Natural Language Processing. Stroudsburg: ACL, 2019: 886-897. 10.18653/v1/d19-1082 |
21 | SUKHBAATAR S, SZLAM A, WESTON J, et al. End-to-end memory networks [C]// Proceedings of the 2015 28th International Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2015: 2440-2448. |
22 | PANG B, LEE L. Seeing stars: exploiting class relationships for sentiment categorization with respect to rating scales [C]// Proceedings of the 2005 43rd Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2005: 115-124. 10.3115/1219840.1219855 |
23 | SOCHER R, PERELYGIN A, WU J, et al. Recursive deep models for semantic compositionality over a Sentiment Treebank [C]// Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2013: 1631-1642. |
24 | SOCHER R, PENNINGTON J, HUANG E H, et al. Semi-supervised recursive autoencoders for predicting sentiment distributions [C]// Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2011: 151-161. |
25 | SOCHER R, HUVAL B, MANNING C D, et al. Semantic compositionality through recursive matrix-vector spaces [C]// Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning. Stroudsburg: ACL, 2012: 1201-1211. |
26 | KIM Y. Convolutional neural networks for sentence classification [C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2014: 1746-1751. 10.3115/v1/d14-1181 |
27 | WANG Y Q, SUN A X, HAN J L, et al. Sentiment analysis by capsules [C]// Proceedings of the 2018 World Wide Web Conference. New York: ACM, 2018: 1165-1174. 10.1145/3178876.3186015 |
28 | YANG M, ZHAO W, CHEN L, et al. Investigating the transferring capability of capsule networks for text classification [J]. Neural Networks, 2019, 118: 247-261. 10.1016/j.neunet.2019.06.014 |
29 | CHEN T, XU R F, HE Y L, et al. Improving sentiment analysis via sentence type classification using BiLSTM-CRF and CNN [J]. Expert Systems with Applications, 2017, 72: 221-230. 10.1016/j.eswa.2016.10.065 |