Contact:
方澄1,李贝1,韩萍1,吴琼2
Abstract: With the rapid development of network information technology, Microblog, one of the most widely used social networking platforms in China, has become a major hub for news and information. Sentiment analysis can quickly and accurately extract users' emotional tendencies from their posts and has broad application prospects. To address the complex and varied syntax of microblog language, a graph convolutional network model based on the syntactic dependency graph (Syntax Graph Convolution Network, SGCN) is proposed for fine-grained sentiment classification of Chinese microblogs. The SGCN model is rich in both structural and semantic expression. A text graph is constructed from the dependency relations between words, and the degree of correlation between words, quantified by pointwise mutual information (PMI), serves as the weight of the corresponding edge so that the structural information of the sentence is fully expressed. Semantic features fused with position information are used as the initial node features, enriching the node representations in the text graph. In experiments on two datasets covering six emotion categories (happiness, sadness, anger, fear, surprise, and no emotion), the average F1 score of the proposed model reached 72.64%. The experimental results show that, under the same conditions, the proposed model uses the structural information of sentences more effectively than other deep learning models and thereby improves classification performance.
Key words: Microblog, Sentiment analysis, Graph Convolutional Network, Deep Learning
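The graph construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy English corpus, the hand-written dependency edges, sentence-level co-occurrence counting for PMI, clipping negative PMI to zero, additive position fusion, and the helper names `pmi_table`, `build_adjacency`, and `gcn_layer` are all assumptions made for illustration. The propagation step uses the commonly used symmetric normalization, which the abstract does not specify.

```python
import math
from collections import Counter
from itertools import combinations

import numpy as np


def pmi_table(sentences):
    """PMI per word pair; co-occurrence is counted per sentence (assumption)."""
    n = len(sentences)
    word_count, pair_count = Counter(), Counter()
    for sent in sentences:
        vocab = sorted(set(sent))
        word_count.update(vocab)
        pair_count.update(combinations(vocab, 2))  # keys are sorted pairs
    return {
        (a, b): math.log((c / n) / ((word_count[a] / n) * (word_count[b] / n)))
        for (a, b), c in pair_count.items()
    }


def build_adjacency(words, dep_edges, pmi):
    """Adjacency for one sentence: dependency edges weighted by PMI."""
    adj = np.eye(len(words))  # self-loops keep each node's own features
    for head, dep in dep_edges:
        key = tuple(sorted((words[head], words[dep])))
        weight = max(pmi.get(key, 0.0), 0.0)  # clip negative PMI (assumption)
        adj[head, dep] = adj[dep, head] = weight
    return adj


def gcn_layer(adj, feats, weight):
    """One graph convolution: ReLU(D^-1/2 A D^-1/2 X W)."""
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))  # degrees >= 1 via self-loops
    norm = adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm @ feats @ weight, 0.0)


# Toy corpus and a hand-written dependency parse (hypothetical data).
corpus = [
    ["i", "love", "this", "movie"],
    ["i", "hate", "rain"],
    ["love", "this", "song"],
]
words = corpus[0]
dep_edges = [(1, 0), (1, 3), (3, 2)]  # (head index, dependent index)

pmi = pmi_table(corpus)
adj = build_adjacency(words, dep_edges, pmi)

rng = np.random.default_rng(0)
emb = rng.normal(size=(len(words), 8))            # stand-in semantic features
pos = np.arange(len(words))[:, None] / len(words)  # simple position signal
feats = emb + pos                                  # additive fusion (assumption)
out = gcn_layer(adj, feats, rng.normal(size=(8, 2)))
```

In this toy corpus, "love" and "this" co-occur in two of three sentences, so their PMI is log(1.5), whereas "i" and "love" have negative PMI and the corresponding dependency edge receives weight zero under the clipping assumption.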
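Abstract (Chinese): With the rapid development of network information technology, Microblog, a social networking application with an extremely high usage rate in China, has become a vast gathering place for news. Sentiment analysis can quickly and accurately mine users' emotional tendencies from their posts and has a large application market. To address the complexity and diversity of the syntactic structure of microblog language, this paper proposes a graph convolutional network model based on syntactic dependency structure (Syntax Graph Convolution Networks, SGCN) for fine-grained sentiment classification of Chinese microblogs. The SGCN model is rich in both structural and semantic expression: a text graph is built from the dependency relations between words, and PMI (Pointwise Mutual Information) quantifies the degree of correlation between words and serves as the weight of the corresponding edge, fully expressing the structural information of the sentence; semantic features fused with position information are used as the initial node features, enriching the semantic features of the nodes in the text graph. In experiments on two microblog sentiment datasets covering six categories (happiness, sadness, anger, fear, surprise, and no emotion), the average F1 score of the proposed model reaches 72.64%. Under the same conditions, the proposed model uses the structural information of sentences more effectively than other deep learning models to improve classification performance.
Keywords (Chinese): microblog, sentiment analysis, graph convolution, deep learning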
CLC Number: TP391.1
方澄, 李贝, 韩萍, 吴琼. Fine-grained microblog sentiment classification based on syntactic dependency graph [J]. .
URL: https://www.joca.cn/EN/