Journal of Computer Applications ›› 2020, Vol. 40 ›› Issue (9): 2531-2535. DOI: 10.11772/j.issn.1001-9081.2020010128

• Artificial Intelligence •

Text sentiment analysis based on gated recurrent unit and capsule features

YANG Yunlong, SUN Jianqiang, SONG Guochao

  1. College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, Shandong 266590, China
  • Received: 2020-02-12 Revised: 2020-03-28 Published: 2020-09-10 Online: 2020-03-31
  • Corresponding author: YANG Yunlong
  • About the authors: YANG Yunlong (born 1995), male, from Weifang, Shandong, is a master's student and CCF member whose research interests include natural language processing, sentiment analysis, and text classification. SUN Jianqiang (born 1996), male, from Dezhou, Shandong, is a master's student whose research interests include artificial intelligence and knowledge graphs. SONG Guochao (born 1994), female, from Yantai, Shandong, is a master's student whose research interests include location privacy protection and trajectory data publishing.
  • Supported by:
    This work is partially supported by the Graduate Innovation Funds of Shandong University of Science and Technology (SDKDYC190225).

Abstract: Simple Recurrent Neural Networks (RNNs) cannot retain information over long spans, and a Convolutional Neural Network (CNN) used alone lacks the ability to capture the contextual semantics of text. To address these problems and improve text classification accuracy, a sentiment analysis model named G-Caps (Gated Recurrent Unit-Capsule) is proposed, which fuses Gated Recurrent Unit (GRU) and capsule features. First, a GRU captures the global contextual features of the text, yielding scalar information about the text as a whole. Second, in the initial capsule layer, the captured information is iterated through the dynamic routing algorithm to obtain vectorized features that represent the overall attributes of the text. Finally, the features are combined in the main capsule part to obtain more accurate text attributes, and the sentiment polarity of the text is determined from the intensity of each feature. Experiments on the benchmark dataset MR (Movie Reviews) show that, compared with CNN+INI (a CNN with initialized convolutional filters) and CL_CNN (a CNN with critic learning), G-Caps improves classification accuracy by 3.1 and 0.5 percentage points, respectively. These results indicate that the G-Caps model effectively improves the accuracy of text sentiment analysis in practical applications.

Key words: sentiment analysis, weight sharing, capsule model, Gated Recurrent Unit (GRU), dynamic routing, text attribute
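
The abstract names dynamic routing between capsule layers and reads sentiment polarity from feature intensity, but gives no implementation details. As a rough illustration only, the sketch below shows the widely used routing-by-agreement procedure with a squash nonlinearity, written in plain Python. It is an assumption that G-Caps uses this variant. The function names, the three routing iterations, and the toy prediction vectors (standing in for features derived from GRU outputs) are all hypothetical and are not taken from the paper.

```python
import math


def squash(v):
    """Rescale a vector so its length lies in [0, 1) while keeping its direction."""
    sq = sum(x * x for x in v)
    if sq == 0.0:
        return [0.0 for _ in v]
    scale = sq / (1.0 + sq) / math.sqrt(sq)
    return [scale * x for x in v]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_routing(u_hat, iterations=3):
    """Routing-by-agreement.

    u_hat[i][j] is the prediction vector that input capsule i makes for
    output capsule j. Returns one squashed vector per output capsule.
    """
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]  # routing logits, start uniform
    v = []
    for _ in range(iterations):
        # Coupling coefficients: each input capsule distributes itself over outputs.
        c = [softmax(row) for row in b]
        v = []
        for j in range(n_out):
            s = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
                 for d in range(dim)]
            v.append(squash(s))
        # Raise the logit wherever a prediction agrees with the current output.
        for i in range(n_in):
            for j in range(n_out):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v


def capsule_lengths(vectors):
    """Vector length of each capsule, read as the strength of that class."""
    return [math.sqrt(sum(x * x for x in vec)) for vec in vectors]


# Toy input (hypothetical): 3 input capsules, 2 output capsules
# (index 0 = negative, index 1 = positive), 2 dimensions each.
u_hat = [
    [[0.1, 0.0], [0.9, 0.8]],
    [[0.0, 0.2], [1.0, 0.7]],
    [[0.1, 0.1], [0.8, 0.9]],
]
lengths = capsule_lengths(dynamic_routing(u_hat))
predicted = max(range(len(lengths)), key=lambda j: lengths[j])
print("capsule lengths:", lengths, "predicted class index:", predicted)
```

In this toy input the predictions for the second output capsule are larger and point in similar directions, so routing concentrates on it and its length ends up the greater of the two. Treating capsule length as class strength is one possible reading of the "intensity of each feature" mentioned in the abstract.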

CLC Number: