Journal of Computer Applications, 2020, Vol. 40, Issue 3: 651-657. DOI: 10.11772/j.issn.1001-9081.2019071210

• Artificial intelligence •

Sentiment analysis using embedding from language model and multi-scale convolutional neural network

ZHAO Ya'ou1,2, ZHANG Jiachong1, LI Yibin3, FU Xianrui1, SHENG Wei1   

  1. Inspur Financial Information Technology Company Limited, Jinan, Shandong 250101, China;
  2. School of Information Science and Engineering, University of Jinan, Jinan, Shandong 250022, China;
  3. School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
  • Received: 2019-07-11 Revised: 2019-09-07 Online: 2020-03-10 Published: 2019-09-29
  • Supported by:
    This work is partially supported by the Key Special Project of Cloud Computing and Big Data of the National Key Research and Development Program of China (2016YFB1001100, 2016YFB1001104) and the Youth Program of the National Natural Science Foundation of China (61702218).

Sentiment analysis fusing language-model-based word embedding and multi-scale convolutional neural network

ZHAO Ya'ou (赵亚欧)1,2, ZHANG Jiachong (张家重)1, LI Yibin (李贻斌)3, FU Xianrui (付宪瑞)1, SHENG Wei (生伟)1

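  1. Inspur Group Financial Information Technology Co., Ltd., Jinan 250101, China;
  2. School of Information Science and Engineering, University of Jinan, Jinan 250022, China;
  3. School of Control Science and Engineering, Shandong University, Jinan 250061, China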
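  • Corresponding author: ZHAO Ya'ou
  • About the authors: ZHAO Ya'ou (born 1982), male, from Jinan, Shandong, lecturer, Ph.D., CCF member, main research interests: natural language processing, artificial intelligence; ZHANG Jiachong (born 1965), male, from Rizhao, Shandong, professor, Ph.D., main research interests: artificial intelligence, data mining; LI Yibin (born 1960), male, from Liaocheng, Shandong, professor, Ph.D., main research interests: robotics, human-computer interaction; FU Xianrui (born 1986), male, from Jinan, Shandong, engineer, main research interests: software architecture, artificial intelligence; SHENG Wei (born 1983), male, from Jinan, Shandong, senior information system project manager, main research interests: artificial intelligence, financial self-service terminals.
  • Supported by:
    Key Special Project of Cloud Computing and Big Data of the National Key Research and Development Program of China (2016YFB1001100, 2016YFB1001104); Youth Program of the National Natural Science Foundation of China (61702218).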

Abstract: Word-embedding technologies such as Word2vec and GloVe generate only one semantic vector for a polysemous word. To solve this problem, a sentiment analysis model based on ELMo (Embedding from Language Model) and a Multi-Scale Convolutional Neural Network (MSCNN) was proposed. Firstly, the ELMo model was used to learn from the pre-training corpus and generate context-dependent word vectors. Compared with traditional word-embedding technology, the ELMo model combines word features and context features through a bidirectional LSTM (Long Short-Term Memory) network, so it can accurately express the different senses of a polysemous word. Besides, because Chinese has far more characters than English, the ELMo model is difficult to train on a Chinese corpus, so pre-trained Chinese character vectors were used to initialize the embedding layer of the ELMo model. Compared with random initialization, this made model training faster and more accurate. Then, the multi-scale convolutional neural network was applied to extract features from the word vectors a second time and fuse them, generating a semantic representation of the whole sentence. Finally, the sentiment polarity of the text was classified with a softmax activation function. Experiments were carried out on the hotel review dataset and the NLPCC2014 task2 dataset. The results show that the proposed model improves accuracy by 1.08 percentage points over the attention-based bidirectional LSTM model on the hotel review dataset, and by 2.16 percentage points over the hybrid model based on LSTM and CNN on the NLPCC2014 task2 dataset.

Key words: sentiment analysis, Natural Language Processing (NLP), Convolutional Neural Network (CNN), Embedding from Language Model (ELMo), character embedding
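
The multi-scale convolution, feature fusion, and softmax steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the kernel widths (2, 3, 4), the number of filters per width, the vector dimension, the use of ReLU with max-over-time pooling, and the random weights are all assumptions chosen for illustration, and random vectors stand in for the context-dependent word vectors that an ELMo encoder would supply. Bias terms are omitted for brevity.

```python
import math
import random


def conv1d_max_pool(seq, filt):
    """Slide one filter of width len(filt) over seq, apply ReLU, max-pool over time."""
    width = len(filt)
    best = 0.0  # ReLU output is never below zero
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        act = sum(w * x for row_w, row_x in zip(filt, window)
                  for w, x in zip(row_w, row_x))
        best = max(best, act)
    return best


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def multi_scale_cnn(seq, filter_banks, out_weights):
    """Fuse max-pooled features from several kernel widths, then classify."""
    # Fusion here is plain concatenation of the pooled features (an assumption).
    features = [conv1d_max_pool(seq, filt)
                for bank in filter_banks for filt in bank]
    logits = [sum(w * f for w, f in zip(row, features)) for row in out_weights]
    return features, softmax(logits)


rng = random.Random(0)
dim, seq_len = 8, 10              # hypothetical vector size and sentence length
widths, per_width = (2, 3, 4), 4  # hypothetical kernel widths and filters per width

# Stand-in for the contextual word vectors an ELMo-style encoder would produce.
sentence = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(seq_len)]
filter_banks = [[[[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(w)]
                 for _ in range(per_width)] for w in widths]
out_weights = [[rng.uniform(-1, 1) for _ in range(len(widths) * per_width)]
               for _ in range(2)]  # two classes: negative, positive

features, probs = multi_scale_cnn(sentence, filter_banks, out_weights)
print(len(features), probs)
```

Each kernel width looks at n-grams of a different length, so concatenating the pooled outputs gives the classifier features at several scales at once. In a trained model the filters and output weights would be learned rather than drawn at random.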

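Abstract: To address the problem that word-embedding technologies such as Word2Vec and GloVe produce only a single semantic vector for a polysemous word, a sentiment analysis model that fuses Embedding from Language Model (ELMo) with a Multi-Scale Convolutional Neural Network (MSCNN) is proposed. First, the model uses ELMo to learn from the pre-training corpus and generate context-dependent word vectors; compared with traditional word-embedding technology, ELMo uses a bidirectional Long Short-Term Memory (LSTM) network to fuse the features of a word itself with the features of its context, and can therefore accurately represent the different senses of a polysemous word. In addition, the model initializes the embedding layer of ELMo with pre-trained Chinese character vectors, which speeds up training and improves training accuracy compared with random initialization. Then, the model uses the multi-scale convolutional neural network to extract features from the word vectors a second time and fuse them, generating a semantic representation of the whole sentence. Finally, the sentiment polarity of the text is classified through a softmax activation function. Experiments were carried out on two public datasets, the hotel review dataset and NLPCC2014 task2. The results show that on the hotel review dataset the model improves accuracy by 1.08 percentage points over the attention-based bidirectional LSTM model, and on the NLPCC2014 task2 dataset it improves accuracy by 2.16 percentage points over the hybrid model of LSTM and Convolutional Neural Network (CNN), demonstrating the effectiveness of the proposed method.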

Key words: sentiment analysis, natural language processing, convolutional neural network, ELMo, character embedding
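
The idea of initializing an embedding layer from pre-trained character vectors instead of random values can be sketched as follows. The function name `build_embedding_matrix`, the toy vocabulary, the 3-dimensional vectors, and the fallback range of ±0.1 are hypothetical choices for illustration only; the abstract does not specify how the authors built or loaded their character vectors.

```python
import random


def build_embedding_matrix(vocab, pretrained, dim, seed=0):
    """Return one vector per vocabulary entry plus the number of pre-trained hits.

    Characters found in `pretrained` reuse their vector; the rest fall back to
    small random values, which is what a purely random initialization would
    use for every row.
    """
    rng = random.Random(seed)
    matrix, hits = [], 0
    for char in vocab:
        vec = pretrained.get(char)
        if vec is not None and len(vec) == dim:
            matrix.append(list(vec))  # copy so later updates leave the source intact
            hits += 1
        else:
            matrix.append([rng.uniform(-0.1, 0.1) for _ in range(dim)])
    return matrix, hits


# Hypothetical toy data: pre-trained vectors exist for two of three characters.
vocab = ["好", "差", "店"]
pretrained = {"好": [0.9, 0.1, 0.0], "差": [-0.8, 0.2, 0.1]}
matrix, hits = build_embedding_matrix(vocab, pretrained, dim=3)
print(hits, matrix[0])
```

Rows that start from pre-trained vectors already encode some character semantics, which is one plausible reason such an initialization could train faster than a fully random one, as the abstract reports.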
