Journal of Computer Applications ›› 2020, Vol. 40 ›› Issue (10): 2838-2844. DOI: 10.11772/j.issn.1001-9081.2020020164

• Artificial Intelligence •

Long text aspect-level sentiment analysis based on text filtering and improved BERT

WANG Kun, ZHENG Yi, FANG Shuya, LIU Shouyin

  1. College of Physical Science and Technology, Central China Normal University, Wuhan Hubei 430079, China
  • Received: 2020-02-19 Revised: 2020-04-09 Online: 2020-10-10 Published: 2020-06-05
  • Corresponding author: LIU Shouyin
  • About the authors: WANG Kun, born in 1996 in Huanggang, Hubei, is an M. S. candidate. His research interests include natural language processing and sentiment analysis. ZHENG Yi, born in 1993 in Wuhan, Hubei, is a Ph. D. candidate. His research interests include deep learning and time-series signal processing. FANG Shuya, born in 1995 in Xiangyang, Hubei, is an M. S. candidate. Her research interests include face recognition and object detection. LIU Shouyin, born in 1964 in Zhoukou, Henan, is a professor with a Ph. D. His research interests include wireless communication, Internet of Things, and machine learning.



Abstract: Aspect-level sentiment analysis aims to classify the sentiment polarity of a text toward different aspects. For long texts, redundancy and noise prevent existing aspect-level sentiment analysis algorithms from fully extracting the features of aspect-related information, lowering classification accuracy; moreover, on datasets whose aspects are layered into coarse-grained and fine-grained levels, existing solutions do not exploit the information carried by the coarse-grained aspects. To address these problems, an algorithm named TFN+BERT-Pair-ATT was proposed based on text filtering and an improved Bidirectional Encoder Representations from Transformers (BERT). First, a Text Filter Network (TFN) combining a Long Short-Term Memory (LSTM) network with an attention mechanism was used to select, directly from the long text, the sentences related to the coarse-grained aspect. Then, the selected sentences were concatenated in their original order, combined with the fine-grained aspect, and fed into BERT-Pair-ATT, which adds an attention layer on top of BERT, for feature extraction. Finally, sentiment classification was performed with Softmax. Compared with the classical Convolutional Neural Network (CNN) based model GCAE (Gated Convolutional network with Aspect Embedding) and the LSTM-based Interactive Attention Network (IAN), the proposed algorithm improves the relevant evaluation metric on the validation set by 3.66% and 4.59% respectively, and improves it by 0.58% compared with the original BERT. The results show that the algorithm based on text filtering and improved BERT is of great value for aspect-level sentiment analysis of long texts.
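The TFN filtering step described above (scoring each sentence of the long text against the coarse-grained aspect and keeping only the relevant sentences in their original order) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the dot-product scoring, the `keep_ratio` parameter, and the precomputed sentence and aspect vectors (which in the paper would come from the LSTM-plus-attention network) are all assumptions.

```python
import numpy as np

def filter_sentences(sent_vecs, aspect_vec, keep_ratio=0.5):
    """Score each sentence against the coarse-grained aspect and keep
    the highest-scoring sentences, restoring their document order."""
    scores = sent_vecs @ aspect_vec                 # relevance score per sentence
    k = max(1, int(len(scores) * keep_ratio))       # how many sentences to keep
    top = np.argsort(scores)[::-1][:k]              # indices of the top-k sentences
    return sorted(top.tolist())                     # original document order

# Toy example: 4 sentence vectors; the aspect vector is closest to sentences 0 and 2.
sents = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [0.9, 0.1],
                  [0.1, 0.9]])
aspect = np.array([1.0, 0.0])
print(filter_sentences(sents, aspect))  # → [0, 2]
```

Restoring document order after selection matters because the kept sentences are later concatenated and fed to BERT as running text.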

Key words: aspect-level, sentiment analysis, pre-trained model, Long Short-Term Memory (LSTM) neural network, attention mechanism
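As a rough illustration of the BERT-Pair-ATT side, the sketch below builds the standard BERT sentence-pair input from the filtered sentences and a fine-grained aspect, then applies an attention-pooling layer (the added "-ATT" layer) followed by a Softmax classifier over hypothetical token states. All function names, shapes, and the `food#quality` aspect label are assumptions for illustration, not the paper's actual architecture details.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def build_pair_input(filtered_sentences, fine_aspect):
    """Standard BERT sentence-pair format: text as segment A, aspect as segment B."""
    return "[CLS] " + " ".join(filtered_sentences) + " [SEP] " + fine_aspect + " [SEP]"

def attention_pool_classify(token_states, query, W_cls, b_cls):
    """Attention layer over BERT token states, then a Softmax sentiment classifier."""
    weights = softmax(token_states @ query)   # attention weight per token
    pooled = weights @ token_states           # weighted sum of token states
    return softmax(pooled @ W_cls + b_cls)    # distribution over sentiment classes

print(build_pair_input(["service was slow", "food was great"], "food#quality"))
# → [CLS] service was slow food was great [SEP] food#quality [SEP]
```

Pairing the filtered text with each fine-grained aspect in this way lets one BERT forward pass produce an aspect-conditioned representation, which the attention layer then pools for classification.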
