Fine-grained emotion classification of Chinese microblog based on syntactic dependency graph

doi:10.11772/j.issn.1001-9081.2022030469

Journal of Computer Applications, 2023, Vol. 43, Issue 4: 1056-1061. DOI: 10.11772/j.issn.1001-9081.2022030469

• Artificial intelligence •

Cheng FANG¹, Bei LI², Ping HAN¹, Qiong WU³

1. College of Electronic Information and Automation, Civil Aviation University of China, Tianjin 300300, China
2. College of Safety Science and Engineering, Civil Aviation University of China, Tianjin 300300, China
3. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China

Received: 2022-04-13 Revised: 2022-09-27 Accepted: 2022-09-28 Online: 2023-04-11 Published: 2023-04-10
Contact: Cheng FANG
About the authors: LI Bei, born in 1993, native of Mianyang, Sichuan, M.S. candidate. Her research interests include natural language processing.
HAN Ping, born in 1966, native of Tianjin, Ph.D., professor. Her research interests include signal and information processing, and Synthetic Aperture Radar (SAR) object detection and recognition.
WU Qiong, born in 1981, native of Beijing, Ph.D. Her research interests include Internet tendency analysis and big data mining.
Supported by: Civil Aviation Safety Capacity Building Fund of CAAC (14002500000019J014)


Abstract:

Emotion analysis can quickly and accurately mine users' emotional tendencies from their posts and has broad application prospects. To handle the complex and diverse syntactic structures of microblog language, a Syntax Graph Convolution Network (SGCN) model was proposed for fine-grained emotion classification of Chinese microblogs. The proposed model is rich in both structural and semantic expression. First, a text graph was constructed from the dependency relations between words, and the degree of correlation between words was quantified by Pointwise Mutual Information (PMI), which was used as the weight of the corresponding edge to represent the structural information of the sentence. Then, semantic features fused with position information were taken as the initial features of the nodes to enrich the semantic features of the nodes in the text graph. The model was evaluated on the microblog emotion classification dataset of Social Media Processing 2020 (SMP2020), which consists of two sets of microblog data covering six emotion categories: happiness, sadness, anger, fear, surprise, and no emotion. Experimental results show that the average F1-score of the proposed model reaches 72.64%, which is 2.75 and 3.87 percentage points higher than those of the BERT (Bidirectional Encoder Representations from Transformers) Graph Convolutional Network (BGCN) model and the Text Level Graph Neural Network (Text-Level-GNN) model, respectively. This verifies that the proposed model uses the structural information of sentences more effectively than the compared deep learning models and thereby improves classification performance.

Key words: microblog, emotion analysis, Graph Convolutional Network (GCN), text graph, deep learning
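
The abstract states that PMI is used as the weight of each dependency edge but does not give the estimation details. The following is only a minimal sketch under stated assumptions: probabilities are estimated at the sentence level, edges are treated as undirected, and only positive PMI values are kept (a convention borrowed from Text GCN [6], not confirmed for SGCN). The function name `pmi_edge_weights` and the toy dependency pairs are hypothetical.

```python
import math
from collections import Counter


def pmi_edge_weights(dependency_pairs_per_sentence):
    """Return {(word_a, word_b): PMI} for word pairs linked by a dependency arc.

    Assumption (not from the paper): p(w) is the fraction of sentences that
    contain w, and p(a, b) is the fraction of sentences in which a and b are
    linked by a dependency arc.
    """
    n_sentences = len(dependency_pairs_per_sentence)
    word_count = Counter()
    pair_count = Counter()
    for pairs in dependency_pairs_per_sentence:
        words = set()
        linked = set()
        for head, dependent in pairs:
            words.update((head, dependent))
            # Sort the pair so that the edge is undirected.
            linked.add(tuple(sorted((head, dependent))))
        word_count.update(words)
        pair_count.update(linked)

    weights = {}
    for (a, b), n_ab in pair_count.items():
        p_ab = n_ab / n_sentences
        p_a = word_count[a] / n_sentences
        p_b = word_count[b] / n_sentences
        pmi = math.log(p_ab / (p_a * p_b))
        if pmi > 0:  # assumption: drop non-positive associations
            weights[(a, b)] = pmi
    return weights


# Hypothetical (head, dependent) pairs, one list per sentence.
toy_corpus = [
    [("rain", "poetic"), ("rain", "outside")],
    [("rain", "poetic")],
    [("news", "drama")],
    [("mask", "wear")],
]
edge_weights = pmi_edge_weights(toy_corpus)
```

In a real pipeline the pairs would come from a dependency parser, and the resulting dictionary would fill the weighted adjacency matrix of the text graph.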

CLC Number: TP391.1

Cheng FANG, Bei LI, Ping HAN, Qiong WU. Fine-grained emotion classification of Chinese microblog based on syntactic dependency graph[J]. Journal of Computer Applications, 2023, 43(4): 1056-1061.

References

1	SOCHER R, PENNINGTON J, HUANG E H, et al. Semi-supervised recursive autoencoders for predicting sentiment distributions[C]// Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2011: 151-161.
2	KIM Y. Convolutional neural networks for sentence classification[C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2014: 1746-1751. DOI: 10.3115/v1/d14-1181
3	TAI K S, SOCHER R, MANNING C D. Improved semantic representations from tree-structured long short-term memory networks[C]// Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2015: 1556-1566. DOI: 10.3115/v1/p15-1150
4	WAN Y, DU Z Z. Fine-grained sentiment analysis of microblog comments based on fusion of sentiment lexicon and semantic rules[J]. Information Research, 2020(11): 34-41. (in Chinese) DOI: 10.3969/j.issn.1005-8095.2020.11.005
5	KIPF T N, WELLING M. Semi-supervised classification with graph convolutional networks[EB/OL]. (2017-02-22) [2022-02-20]. DOI: 10.48550/arXiv.1609.02907
6	YAO L, MAO C S, LUO Y. Graph convolutional networks for text classification[C]// Proceedings of the 33rd AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2019: 7370-7377. DOI: 10.1609/aaai.v33i01.33017370
7	LAI Y N, ZHANG L F, HAN D H, et al. Fine-grained emotion classification of Chinese microblogs based on graph convolution networks[J]. World Wide Web, 2020, 23(5): 2771-2787. DOI: 10.1007/s11280-020-00803-0
8	WANG G, LI H Y, QIU Y F, et al. Aspect-based sentiment classification via memory graph convolutional network[J]. Journal of Chinese Information Processing, 2021, 35(8): 98-106. (in Chinese) DOI: 10.3969/j.issn.1003-0077.2021.08.013
9	ZHANG J L, ZHANG Y F, WANG M Q, et al. Joint extraction of Chinese entity relations based on graph convolutional neural network[J]. Computer Engineering, 2021, 47(12): 103-111. (in Chinese)
10	ZHAO J, LIU K, XU L H. Book review: sentiment analysis: mining opinions, sentiments, and emotions[J]. Computational Linguistics, 2016, 42(3): 595-598. DOI: 10.1162/coli_r_00259
11	YANG Z L, DAI Z H, YANG Y M, et al. XLNet: generalized autoregressive pretraining for language understanding[C/OL]// Proceedings of the 33rd Conference on Neural Information Processing Systems [2022-02-14].
12	DAI Z H, YANG Z L, YANG Y M, et al. Transformer-XL: attentive language models beyond a fixed-length context[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2019: 2978-2988. DOI: 10.18653/v1/p19-1285
13	LI X W, NING H Y. Deep pyramid convolutional neural network integrated with self-attention mechanism and highway network for text classification[J]. Journal of Physics: Conference Series, 2020, 1642: No.012008. DOI: 10.1088/1742-6596/1642/1/012008
14	JOULIN A, GRAVE E, BOJANOWSKI P, et al. Bag of tricks for efficient text classification[C]// Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (Volume 2: Short Papers). Stroudsburg, PA: ACL, 2017: 427-431. DOI: 10.18653/v1/e17-2068
15	ZHOU P, SHI W, TIAN J, et al. Attention-based bidirectional long short-term memory networks for relation classification[C]// Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Stroudsburg, PA: ACL, 2016: 207-212. DOI: 10.18653/v1/p16-2034
16	HUANG L Z, MA D H, LI S J, et al. Text level graph neural network for text classification[C]// Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. Stroudsburg, PA: ACL, 2019: 3444-3450. DOI: 10.18653/v1/d19-1345
17	FANG C, LI B, HAN P. Semi-supervised microblog text sentiment classification based on global feature graph[J]. Journal of Signal Processing, 2021, 37(6): 1066-1074. (in Chinese) DOI: 10.16798/j.issn.1003-0530.2021.06.018

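Emotion	General microblog	Epidemic microblog
Happiness	Woke up this morning to find it raining outside; it feels so poetic	Thank you for being there @ #Salute to the frontline medical workers of the epidemic#
Sadness	Stumbling and bumping around, aching here and sore there, bruised all over; I am breaking down and want to cry	[tears][tears][tears] The news I have read these past few days all feels like a TV drama
Surprise	After turning off the light, I actually heard a mosquito buzzing. Am I hearing things?	#The longest incubation period of the novel pneumonia is about 14 days# Two more days until the incubation period is over [tears]??
Fear	Afterwards I was still a little shaken; safety first!	It does not feel at all like tomorrow is Lunar New Year's Eve; the whole country is in a state of panic
Anger	You are such a picky eater; I will not feed you anymore	Damn it [gloomy]
No emotion	Back to the topic: let us first look at the background of the Demon King	What ordinary people can do: stay away from crowds as much as possible, wear a mask when going out, and share protection knowledge with family members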

Dataset	Training set	Data augmentation	Validation set	Test set
General microblog	27 768	30 512	2 000	5 000
Epidemic microblog	8 606	12 419	2 000	3 000

Parameter	Value	Parameter	Value
Number of epochs (Epoch)	64	Dropout_rate	0.5
Optimizer	Adam	Weight initialization	Random initialization
Initial learning rate (Learning_rate)	0.001	Number of hidden units (Hidden_unit)	200
Batch_size	16

Model	F1, general (%)	F1, epidemic (%)	Macro_F_final (%)
Text-CNN [2]	63.21	60.11	61.66
DPCNN [13]	65.64	63.43	64.54
FastText [14]	64.45	60.80	62.63
LSTM [3]	65.53	67.70	66.62
BiLSTM [15]	66.25	70.14	68.20
Text-Level-GNN [16]	68.10	69.43	68.77
BGCN [17]	69.65	72.21	69.89
SGCN	71.50	73.77	72.64
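
The margins quoted in the abstract (2.75 and 3.87 percentage points) can be checked against the Macro_F_final column of the table above. The short sketch below only redoes that subtraction; the helper name `margin_over` is hypothetical and the values are copied from the table.

```python
# Macro_F_final values (%) copied from the comparison table above.
macro_f_final = {
    "Text-Level-GNN": 68.77,
    "BGCN": 69.89,
    "SGCN": 72.64,
}


def margin_over(baseline, model="SGCN", scores=macro_f_final):
    """Percentage-point gap between `model` and `baseline`, rounded to 2 decimals."""
    return round(scores[model] - scores[baseline], 2)


print(margin_over("BGCN"))            # 2.75
print(margin_over("Text-Level-GNN"))  # 3.87
```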

Model	F1, general (%)	F1, epidemic (%)
GCN	63.84	65.21
GCN + word-vector features	69.00	70.86
GCN + word-vector + emoticon features	70.59	72.24
GCN + word-vector + emoticon features + PMI features	71.50	73.77

Emotion	Precision (%)				Recall (%)				F1 (%)
	General		Epidemic		General		Epidemic		General		Epidemic
	Before fusion	After fusion	Before fusion	After fusion	Before fusion	After fusion	Before fusion	After fusion	Before fusion	After fusion	Before fusion	After fusion
Happiness	66.97	72.23	86.28	86.57	66.11	67.19	86.56	88.77	66.53	69.62	86.42	87.66
Sadness	54.76	57.74	37.37	56.34	55.56	61.33	50.68	54.79	55.16	59.48	43.02	55.56
Surprise	59.36	54.42	11.01	15.43	34.85	51.21	17.65	36.76	43.92	52.76	13.56	21.74
Fear	57.50	58.33	37.76	52.35	65.71	76.67	28.42	46.84	61.33	66.26	32.43	49.44
Anger	71.16	77.29	61.92	69.82	79.79	76.67	63.93	68.47	75.23	76.98	62.91	69.14
No emotion	81.63	81.91	57.73	58.13	76.77	80.51	59.62	64.62	79.13	81.20	58.66	61.20
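
The F1 columns in the table above agree with the usual definition of F1 as the harmonic mean of precision and recall. As a quick check, the sketch below recomputes one cell, the Fear (害怕) row of the epidemic set after fusion; the function name `f1_score` is illustrative, and the small tolerance allows for the rounding of the published precision and recall.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same unit as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision, recall and reported F1 (all in %), copied from the table:
# Fear row, epidemic set, after fusion.
precision, recall, reported_f1 = 52.35, 46.84, 49.44
recomputed = f1_score(precision, recall)
assert abs(recomputed - reported_f1) < 0.01
```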
