Chinese implicit sentiment classification model based on sequence and contextual features

doi:10.11772/j.issn.1001-9081.2020111760

Abstract

Sentiment analysis of the massive volume of text on social networks can better reveal the behavior patterns of Internet users, helping decision-making institutions understand public opinion trends and helping businesses improve service quality. Chinese implicit sentiment classification is more difficult than the same task in other languages because key sentiment features are absent and because of the influence of the forms of the expression medium and of cultural customs. Existing Chinese implicit sentiment classification methods are mainly based on Convolutional Neural Network (CNN) and have two defects: they cannot capture the sequential information of words, and they do not make reasonable use of contextual sentiment features when discriminating implicit sentiment. To solve these problems, a Chinese implicit sentiment classification model combining sequence and contextual features, named GGBA (GCNN-GRU-BiGRU-Attention), was proposed. In the model, a Gated Convolutional Neural Network (GCNN) was used to extract the important local information of sentences with implicit sentiment, and a Gated Recurrent Unit (GRU) network was used to enhance the temporal information of the features. For the context of sentences with implicit sentiment, a combination of Bidirectional Gated Recurrent Unit (BiGRU) and attention mechanism was used to extract the important sentiment features. After the two types of features were obtained, the important contextual features were integrated into implicit sentiment discrimination through a fusion layer. Experimental results on the implicit sentiment analysis evaluation dataset showed that the macro-average precision of the GGBA model was 3.72% higher than that of the plain text CNN (TextCNN), 2.57% higher than that of GRU, and 1.90% higher than that of Disconnected Recurrent Neural Network (DRNN). Therefore, the GGBA model achieves better classification performance than the baseline models on implicit sentiment analysis tasks.
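The comparison above is reported in terms of macro-average precision. As a minimal sketch of how such a metric is commonly computed, the Python function below takes the unweighted mean of the per-class precision values. The function name `macro_precision`, the label strings, and the toy predictions are illustrative assumptions, not the authors' evaluation code or data, and the exact metric definition used in the paper may differ.

```python
from collections import defaultdict


def macro_precision(y_true, y_pred):
    """Unweighted mean of per-class precision.

    Hypothetical helper for illustration only. A class that is never
    predicted contributes a precision of 0.0 (a common convention).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    true_positive = defaultdict(int)   # correct predictions per class
    predicted = defaultdict(int)       # total predictions per class
    for truth, guess in zip(y_true, y_pred):
        predicted[guess] += 1
        if truth == guess:
            true_positive[guess] += 1

    labels = set(y_true) | set(y_pred)
    per_class = [
        true_positive[c] / predicted[c] if predicted[c] else 0.0
        for c in labels
    ]
    return sum(per_class) / len(per_class)


# Toy example with three assumed sentiment labels (not real data).
y_true = ["pos", "pos", "neg", "neg", "neu", "neu"]
y_pred = ["pos", "neg", "neg", "neg", "neu", "pos"]
score = macro_precision(y_true, y_pred)
```

In this toy example the per-class precisions are 1/2 for `pos`, 2/3 for `neg`, and 1 for `neu`, so the macro average is 13/18. Because every class is weighted equally, a minority class affects the score as much as a majority class does.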

Key words: Chinese implicit sentiment classification, Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), contextual feature, attention mechanism

摘要： 对社交网络上的海量文本信息进行情感分析可以更好地挖掘网民行为规律，从而帮助决策机构了解舆情倾向以及帮助商家改善服务质量。由于不存在关键情感特征、表达载体形式和文化习俗等因素的影响，中文隐式情感分类任务比其他语言更加困难。已有的中文隐式情感分类方法以卷积神经网络（CNN）为主，这些方法存在着无法获取词语的时序信息和在隐式情感判别中未合理利用上下文情感特征的缺陷。为了解决以上问题，采用门控卷积神经网络（GCNN）提取隐式情感句的局部重要信息，采用门控循环单元（GRU）网络增强特征的时序信息；而在隐式情感句的上下文特征处理上，采用双向门控循环单元（BiGRU）+注意力机制（Attention）的组合提取重要情感特征；在获得两种特征后，通过融合层将上下文重要特征融入到隐式情感判别中；最后得到的融合时序和上下文特征的中文隐式情感分类模型被命名为GGBA。在隐式情感分析评测数据集上进行实验，结果表明所提出的GGBA模型在宏平均准确率上比普通的文本CNN即TextCNN提高了3.72%、比GRU提高了2.57%、比中断循环神经网络（DRNN）提高了1.90%，由此可见， GGBA模型在隐式情感分析任务中比基础模型获得了更好的分类性能。

关键词: 中文隐式情感分类, 卷积神经网络, 循环神经网络, 上下文特征, 注意力机制

CLC Number: TP391.1

YUAN Jingling, DING Yuanyuan, PAN Donghang, LI Lin. Chinese implicit sentiment classification model based on sequence and contextual features[J]. Journal of Computer Applications, 2021, 41(10): 2820-2828.

袁景凌, 丁远远, 潘东行, 李琳. 基于时序和上下文特征的中文隐式情感分类模型[J]. 计算机应用, 2021, 41(10): 2820-2828.
