Sentiment analysis using embedding from language model and multi-scale convolutional neural network

doi:10.11772/j.issn.1001-9081.2019071210

Journal of Computer Applications, 2020, Vol. 40, Issue (3): 651-657.

• Artificial intelligence •

ZHAO Ya'ou¹,², ZHANG Jiachong¹, LI Yibin³, FU Xianrui¹, SHENG Wei¹

1. Inspur Financial Information Technology Company Limited, Jinan Shandong 250101, China;
2. School of Information Science and Engineering, University of Jinan, Jinan Shandong 250022, China;
3. School of Control Science and Engineering, Shandong University, Jinan Shandong 250061, China

Received: 2019-07-11  Revised: 2019-09-07  Online: 2019-09-29  Published: 2020-03-10
Supported by:
This work is partially supported by the Key Special Project of Cloud Computing and Big Data of the National Key Research and Development Program of China (2016YFB1001100, 2016YFB1001104) and the Youth Program of the National Natural Science Foundation of China (61702218).

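Corresponding author: ZHAO Ya'ou
About the authors: ZHAO Ya'ou (born 1982), male, from Jinan, Shandong, lecturer, Ph.D., CCF member; research interests: natural language processing, artificial intelligence. ZHANG Jiachong (born 1965), male, from Rizhao, Shandong, professor, Ph.D.; research interests: artificial intelligence, data mining. LI Yibin (born 1960), male, from Liaocheng, Shandong, professor, Ph.D.; research interests: robotics, human-computer interaction. FU Xianrui (born 1986), male, from Jinan, Shandong, engineer; research interests: software architecture, artificial intelligence. SHENG Wei (born 1983), male, from Jinan, Shandong, information system project manager (senior); research interests: artificial intelligence, financial self-service terminals.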

Abstract

Abstract: Word-embedding technologies such as Word2vec and GloVe generate only one semantic vector for a polysemous word. To solve this problem, a sentiment analysis model based on ELMo (Embedding from Language Model) and a Multi-Scale Convolutional Neural Network (MSCNN) was proposed. Firstly, the ELMo model was used to learn from the pre-training corpus and generate context-dependent word vectors. Compared with traditional word-embedding technology, ELMo combines word features and context features through a bidirectional LSTM (Long Short-Term Memory) network, so that the different senses of a polysemous word can be represented accurately. Besides, because the number of Chinese characters is far larger than that of English characters, the ELMo model is difficult to train on a Chinese corpus; therefore, pre-trained Chinese character vectors were used to initialize the embedding layer of the ELMo model. Compared with random initialization, this made model training faster and more accurate. Then, the multi-scale convolutional neural network was applied to perform secondary feature extraction and feature fusion on the word vectors, generating a semantic representation of the whole sentence. Finally, the sentiment polarity of the text was classified by a softmax function. Experiments were carried out on the public hotel review dataset and the NLPCC2014 task2 dataset. The results show that, on the hotel review dataset, the proposed model improves accuracy by 1.08 percentage points over the attention-based bidirectional LSTM model, and on the NLPCC2014 task2 dataset, it improves accuracy by 2.16 percentage points over the hybrid model based on LSTM and CNN, which demonstrates the effectiveness of the proposed method.

Key words: sentiment analysis, Natural Language Processing (NLP), Convolutional Neural Network (CNN), Embedding from Language Model (ELMo), character embedding
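
The multi-scale convolution step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the kernel widths (2, 3, 4), the number of filters per width, the ReLU activation, max-over-time pooling, fusion by concatenation, and the random vectors standing in for ELMo outputs are all assumptions made for illustration, since the abstract does not specify them. All function names are hypothetical.

```python
import math
import random


def conv1d_max_pool(vectors, kernel, bias):
    """Slide one filter over the sequence, apply ReLU, then max-pool over time.

    `kernel` holds one weight vector per covered position, so len(kernel)
    is the filter width (the "scale"). Returns 0.0 if the sequence is
    shorter than the filter.
    """
    width = len(kernel)
    best = 0.0  # ReLU output is never below zero
    for start in range(len(vectors) - width + 1):
        act = bias
        for offset in range(width):
            act += sum(w * x for w, x in zip(kernel[offset], vectors[start + offset]))
        best = max(best, act)
    return best


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_filters(rng, widths, filters_per_width, dim):
    """Randomly initialise (kernel, bias) pairs for every filter width."""
    return {
        w: [
            ([[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(w)], 0.0)
            for _ in range(filters_per_width)
        ]
        for w in widths
    }


def mscnn_forward(vectors, filters, out_weights, out_bias):
    """Extract features at several scales, concatenate them, classify with softmax."""
    features = []
    for width in sorted(filters):
        for kernel, bias in filters[width]:
            features.append(conv1d_max_pool(vectors, kernel, bias))
    logits = [
        b + sum(w * f for w, f in zip(row, features))
        for row, b in zip(out_weights, out_bias)
    ]
    return features, softmax(logits)


rng = random.Random(0)
dim = 8  # assumed embedding size, far smaller than a real ELMo vector
# Stand-in for six context-dependent word vectors (a real system would use ELMo output).
sentence = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(6)]
widths = (2, 3, 4)
filters = init_filters(rng, widths, 4, dim)
n_features = len(widths) * 4
# Two output classes, e.g. negative and positive sentiment.
out_weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_features)] for _ in range(2)]
out_bias = [0.0, 0.0]

features, probs = mscnn_forward(sentence, filters, out_weights, out_bias)
print(len(features), [round(p, 3) for p in probs])
```

Each filter width looks at a different n-gram span, so concatenating the pooled outputs gives one fixed-length sentence vector regardless of sentence length. The weights here are untrained, so the printed probabilities carry no meaning beyond showing the data flow.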

CLC Number: TP183

ZHAO Ya'ou, ZHANG Jiachong, LI Yibin, FU Xianrui, SHENG Wei. Sentiment analysis using embedding from language model and multi-scale convolutional neural network[J]. Journal of Computer Applications, 2020, 40(3): 651-657.
