Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (12): 3703-3710. DOI: 10.11772/j.issn.1001-9081.2022121894

• Artificial Intelligence •

Semantically enhanced sentiment classification model based on multi-level attention

Jianle CAO, Nana LI

  1. School of Artificial Intelligence and Data Science, Hebei University of Technology, Tianjin 300401, China
  • Received: 2023-02-01  Revised: 2023-03-05  Accepted: 2023-03-08  Online: 2023-03-17  Published: 2023-12-10
  • Contact: Nana LI
  • About author: CAO Jianle, born in 1998 in Weifang, Shandong, M. S. candidate. His research interests include text classification and sentiment analysis.

Abstract:

The existing text sentiment classification methods face severe challenges due to the complex semantics of natural language, the multiple sentiment polarities of words, and the long-term dependencies in text. To address these problems, a semantically enhanced sentiment classification model based on multi-level attention was proposed. Firstly, contextualized dynamic word embedding technology was used to mine the multiple semantic meanings of words and to model the contextual semantics. Secondly, the long-term dependencies within the text were captured by multiple parallel layers of multi-head self-attention in the internal attention layer, so as to obtain comprehensive text feature information. Thirdly, in the external attention layer, the summary information in the review metadata was fused into the review features through a multi-level attention mechanism, thereby enhancing the sentiment information and semantic expressiveness of the review features. Finally, a global average pooling layer and the Softmax function were used to perform sentiment classification. Experimental results on four Amazon review datasets show that, compared with TE-GRU (Transformer Encoder with Gated Recurrent Unit), the best-performing baseline model, the proposed model improves the sentiment classification accuracy on the App, Kindle, Electronic and CD datasets by at least 0.36, 0.34, 0.58 and 0.66 percentage points respectively, verifying that the proposed model can further improve sentiment classification performance.
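
The pipeline described in the abstract (contextualized embeddings, parallel multi-head self-attention in an internal attention layer, summary-aware external attention, then global average pooling and Softmax) can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: the module names, dimensions, the residual/averaging choices, the single cross-attention step used to fuse the summary, and the number of classes are all assumptions, and the contextualized dynamic word embeddings are assumed to be token-level outputs of a BERT-style encoder supplied by the caller.

# Minimal sketch of the data flow described in the abstract (assumptions noted above).
import torch
import torch.nn as nn

class MultiLevelAttentionClassifier(nn.Module):
    def __init__(self, embed_dim=768, num_heads=8, num_parallel=3, num_classes=2):
        super().__init__()
        # Internal attention layer: several parallel multi-head self-attention
        # blocks over the review representation (number of branches is an assumption).
        self.internal_attn = nn.ModuleList([
            nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
            for _ in range(num_parallel)
        ])
        # External attention layer: review tokens attend to the summary tokens,
        # injecting the summary's sentiment cues into the review features.
        self.external_attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(embed_dim)
        self.norm2 = nn.LayerNorm(embed_dim)
        self.classifier = nn.Linear(embed_dim, num_classes)

    def forward(self, review_emb, summary_emb):
        # review_emb, summary_emb: contextualized dynamic word embeddings,
        # e.g. token outputs of a BERT-style encoder, shape (batch, seq_len, embed_dim).
        # Internal attention: average the parallel self-attention outputs (an assumption;
        # the paper does not specify how the branches are combined).
        internal = torch.stack(
            [attn(review_emb, review_emb, review_emb)[0] for attn in self.internal_attn]
        ).mean(dim=0)
        internal = self.norm1(internal + review_emb)   # residual connection
        # External attention: fuse summary information into the review features.
        fused, _ = self.external_attn(internal, summary_emb, summary_emb)
        fused = self.norm2(fused + internal)
        # Global average pooling over tokens, then Softmax classification.
        pooled = fused.mean(dim=1)
        return torch.softmax(self.classifier(pooled), dim=-1)

For example, with review_emb of shape (32, 200, 768) and summary_emb of shape (32, 12, 768), the module returns a (32, num_classes) tensor of class probabilities; in practice the Softmax would typically be folded into the loss during training.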

Key words: sentiment classification, Natural Language Processing (NLP), word embedding, attention mechanism, neural network
