Journal of Computer Applications ›› 2021, Vol. 41 ›› Issue (10): 2842-2848. DOI: 10.11772/j.issn.1001-9081.2020122043

Topic: Artificial Intelligence


Chinese text sentiment analysis model based on gated mechanism and convolutional neural network

YANG Lu, HE Mingxiang

  1. College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, Shandong 266590, China
  • Received: 2020-12-28 Revised: 2021-04-25 Online: 2021-07-14 Published: 2021-10-10
  • Corresponding author: HE Mingxiang
  • About the authors: YANG Lu (born 1995), female, from Heze, Shandong; M.S. candidate and CCF member; research interests: text sentiment analysis and deep learning. HE Mingxiang (born 1969), male, from Hefei, Anhui; associate professor, Ph.D.; research interests: database systems, information processing, and artificial intelligence.
  • Supported by:
    This work is partially supported by the Research Project of 2020 Qingdao Social Sciences Plan (QDSKL2001143).

Abstract: The particular characteristics of Chinese text easily introduce noise during sentiment classification, and a traditional Convolutional Neural Network (CNN) cannot mine sentiment features in depth. To address these problems, a Dual-Channel Gated Convolutional Neural Network with Sentiment Lexicon (DC-GCNN-SL) model was proposed. First, the words in each sentence were tagged with their sentiment scores from a sentiment lexicon, so that the network acquired prior sentiment knowledge and noise in the input sentences was effectively removed during training. Second, to capture the deep sentiment features of sentences, a gated mechanism based on the Gated Tanh-ReLU Unit (GTRU) was proposed; text convolutions over the two input channels fused the two kinds of features and controlled the information flow, yielding richer hidden information. Finally, the sentiment polarity of the text was output through the softmax function. Experiments were carried out on a hotel review dataset, a takeaway review dataset, and a commodity review dataset. The results show that, compared with other text sentiment analysis models, the proposed model achieves better accuracy, precision, recall, and F1-score, and can effectively capture the sentiment features of sentences.

Key words: Natural Language Processing (NLP), text sentiment analysis, sentiment lexicon, Convolutional Neural Network (CNN), gated mechanism
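
The pipeline summarized in the abstract (lexicon tagging, gated convolution over two input channels, softmax output) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The toy lexicon, the 2-dimensional word vectors, the weights, the choice to build the second channel by scaling word vectors with their lexicon scores, and the assignment of the tanh and ReLU branches to the two channels are all assumptions made for illustration; the abstract does not specify them.

```python
import math

# Hypothetical toy lexicon: word -> sentiment score (values are made up).
LEXICON = {"good": 1.0, "clean": 0.5, "bad": -1.0, "noisy": -0.5}

# Hypothetical 2-dimensional vectors standing in for trained word embeddings.
EMBED = {
    "room": [0.1, 0.3], "good": [0.8, 0.2], "clean": [0.6, 0.1],
    "bad": [-0.7, 0.4], "noisy": [-0.5, 0.2], "but": [0.0, 0.0],
}


def two_channels(tokens):
    """Channel 1: word vectors. Channel 2 (assumed): vectors scaled by lexicon score."""
    words = [EMBED.get(t, [0.0, 0.0]) for t in tokens]
    scores = [LEXICON.get(t, 0.0) for t in tokens]
    marked = [[s * x for x in vec] for s, vec in zip(scores, words)]
    return words, marked


def conv1d(seq, kernel, bias):
    """Slide a width-k kernel over the sequence, producing one scalar per window."""
    k = len(kernel)
    out = []
    for i in range(len(seq) - k + 1):
        total = bias
        for j in range(k):
            total += sum(w * x for w, x in zip(kernel[j], seq[i + j]))
        out.append(total)
    return out


def gtru(words, marked, w_s, b_s, w_a, b_a):
    """Tanh branch on one channel, multiplied elementwise by a ReLU gate on the other."""
    s = [math.tanh(v) for v in conv1d(words, w_s, b_s)]
    a = [max(0.0, v) for v in conv1d(marked, w_a, b_a)]
    return [si * ai for si, ai in zip(s, a)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(tokens, params):
    """Forward pass: two channels -> gated convolution -> max pooling -> softmax.

    The sentence must be at least as long as the kernel width.
    """
    words, marked = two_channels(tokens)
    gated = gtru(words, marked, params["w_s"], params["b_s"],
                 params["w_a"], params["b_a"])
    pooled = max(gated)  # max-over-time pooling of the single filter
    logits = [w * pooled + b for w, b in zip(params["w_out"], params["b_out"])]
    return softmax(logits)  # [P(negative), P(positive)]


# Made-up parameters: one filter of width 2 per branch, two output classes.
PARAMS = {
    "w_s": [[0.5, 0.1], [0.3, -0.2]], "b_s": 0.0,
    "w_a": [[0.4, 0.2], [0.6, 0.1]], "b_a": 0.0,
    "w_out": [-1.0, 1.0], "b_out": [0.0, 0.0],
}

probs = predict(["room", "good", "clean"], PARAMS)
```

In this sketch, a window that contains no lexicon words has an all-zero second channel, so its ReLU gate closes and the window contributes nothing. This is one simple way to picture how a gate can control the information flow; a trained model would learn the filters and use many of them.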

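The four evaluation measures named in the abstract can be computed from a confusion matrix. The sketch below assumes a binary task with one class treated as positive; the abstract does not say how the measures were averaged, and the labels here are made up, not results from the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall, f1) for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1


# Made-up labels: 1 = positive review, 0 = negative review.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
accuracy, precision, recall, f1 = binary_metrics(y_true, y_pred)
```

With these labels there are two true positives, one false positive, one false negative, and two true negatives, so all four measures equal 2/3.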