Journal of Computer Applications, 2018, Vol. 38, Issue (6): 1542-1546. DOI: 10.11772/j.issn.1001-9081.2017122926

• Artificial intelligence •

Application of dual-channel convolutional neural network in text sentiment analysis

LI Ping, DAI Yueming, WU Dinghui

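  1. School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China
  • Received: 2017-12-14 Revised: 2018-02-08 Online: 2018-06-10 Published: 2018-06-13
  • Corresponding author: LI Ping
  • About the authors: LI Ping (born 1990), female, from Lianyungang, Jiangsu, M.S. candidate; main research interests: natural language processing and deep learning. DAI Yueming (born 1964), male, from Wuxi, Jiangsu, associate professor, M.S.; main research interests: artificial intelligence, pattern recognition, and natural language processing. WU Dinghui (born 1970), male, from Wuxi, Jiangsu, associate professor, Ph.D.; main research interests: wind power generation and robotics.
  • Supported by:
    National Natural Science Foundation of China (61572237).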

Application of dual-channel convolutional neural network in sentiment analysis

LI Ping, DAI Yueming, WU Dinghui

  1. School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China
  • Received: 2017-12-14 Revised: 2018-02-08 Online: 2018-06-10 Published: 2018-06-13
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61572237).

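Abstract: To address the problem that a single-channel Convolutional Neural Network (CNN) has a single perspective and cannot fully learn the feature information of text, a Dual-Channel CNN (DCCNN) algorithm is proposed. First, word vectors are trained with Word2Vec and used to obtain the semantic information of sentences. Second, convolution is performed over two different channels, one holding character vectors and the other word vectors, so that the fine-grained character vectors help the word vectors capture deep semantic information. Finally, convolution kernels of different sizes are used to discover higher-level abstract features within a sentence. Experimental results show that the proposed DCCNN algorithm accurately identifies the sentiment polarity of text, with accuracy and F1 score both above 95%, a significant improvement over logistic regression, Support Vector Machine (SVM), and CNN algorithms.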

Key words: convolutional neural network, text sentiment analysis, word vector, character vector, convolution kernel

Abstract: A single-channel Convolutional Neural Network (CNN) views text from a single perspective and cannot fully learn its feature information. To solve this problem, a Dual-Channel CNN (DCCNN) algorithm was proposed. Firstly, word vectors were trained with Word2Vec, and the semantic information of sentences was obtained from the word vectors. Secondly, two different channels were used for the convolution operations: one channel carried character vectors and the other carried word vectors. The fine-grained character vectors assisted the word vectors in capturing deep semantic information. Finally, convolutional kernels of different sizes were used to find higher-level abstract features within the sentence. The experimental results show that the proposed DCCNN algorithm can accurately identify the sentiment polarity of text; its accuracy and F1 value are both above 95%, a significant improvement over logistic regression, Support Vector Machine (SVM), and CNN algorithms.
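
The dual-channel idea in the abstract can be illustrated with a minimal forward-pass sketch. The abstract does not state how the two channels are combined, so this sketch assumes each channel is convolved separately with kernels of several heights, pooled by max-over-time, and the pooled features are concatenated. All names (`conv_max_pool`, `dual_channel_features`), the toy sizes, and the random embeddings are hypothetical stand-ins; in the paper the word vectors come from Word2Vec and the weights would be learned.

```python
import random


def conv_max_pool(seq, kernel, bias=0.0):
    """Slide one kernel over a sequence of embedding vectors, apply ReLU,
    and return the max-over-time activation (one scalar feature).
    Returns 0.0 if the sequence is shorter than the kernel."""
    height = len(kernel)  # number of consecutive tokens the kernel covers
    best = 0.0            # ReLU output is never below zero
    for start in range(len(seq) - height + 1):
        total = bias
        for row, vec in zip(kernel, seq[start:start + height]):
            total += sum(w * x for w, x in zip(row, vec))
        best = max(best, total)
    return best


def random_matrix(rows, cols, rng):
    """Toy stand-in for trained embeddings or learned kernel weights."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def dual_channel_features(char_seq, word_seq, char_kernels, word_kernels):
    """Assumed merge strategy: convolve each channel with its own kernels
    and concatenate the pooled features into one vector."""
    char_feats = [conv_max_pool(char_seq, k) for k in char_kernels]
    word_feats = [conv_max_pool(word_seq, k) for k in word_kernels]
    return char_feats + word_feats


rng = random.Random(0)
dim = 4                                # toy embedding size (assumption)
char_seq = random_matrix(7, dim, rng)  # a sentence of 7 characters
word_seq = random_matrix(4, dim, rng)  # the same sentence as 4 words
kernel_heights = (2, 3)                # kernels of different sizes
char_kernels = [random_matrix(h, dim, rng) for h in kernel_heights]
word_kernels = [random_matrix(h, dim, rng) for h in kernel_heights]

features = dual_channel_features(char_seq, word_seq, char_kernels, word_kernels)
print(len(features))  # 4: two kernel sizes per channel, two channels
```

In a full model, the concatenated feature vector would feed a classifier layer that predicts sentiment polarity; that layer and the training loop are omitted here.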

Key words: Convolutional Neural Network (CNN), sentiment analysis, word vector, character vector, convolutional kernel

CLC number: