Journal of Computer Applications ›› 2018, Vol. 38 ›› Issue (5): 1261-1266. DOI: 10.11772/j.issn.1001-9081.2017112709


Joint sentiment/topic model integrating user characteristics

XU Yinjie, SUN Chunhua, LIU Yezheng   

  1. School of Management, Hefei University of Technology, Hefei Anhui 230009, China
  • Received: 2017-11-15 Revised: 2017-12-22 Online: 2018-05-10 Published: 2018-05-24
  • Contact: SUN Chunhua
  • Supported by:
    This work is partially supported by the Humanities and Social Sciences Foundation of the Ministry of Education (15YJC630111).

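Joint sentiment/topic model considering user characteristics

XU Yinjie, SUN Chunhua, LIU Yezheng

  1. School of Management, Hefei University of Technology, Hefei 230009
  • Corresponding author: SUN Chunhua
  • About the authors: XU Yinjie (born 1994), female, from Jiangyin, Jiangsu, master's student; research interests: sentiment analysis, topic models. SUN Chunhua (born 1977), female, from Hefei, Anhui, associate professor, Ph.D.; research interests: consumer behavior. LIU Yezheng (born 1965), male, from Hexian, Anhui, professor, Ph.D.; research interests: e-commerce and business intelligence, decision theory and methods.
  • Funding:
    Humanities and Social Sciences Foundation of the Ministry of Education (15YJC630111).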

Abstract: The Joint Sentiment/Topic (JST) model can extract both topics and sentiments from text, but the existing JST model mainly models textual content and does not consider user characteristics, which may lead to user demographic bias and behavioral event bias in sentiment analysis results. To address this, the Joint-User Sentiment/Topic (JUST) model was proposed. Its main improvement was that user characteristics were added to the model: a linear function of the user characteristics corresponding to each document was used as the prior of the document-sentiment distribution, so the model could obtain the sentiment tendencies of customers with different characteristics toward different topics. The validity of the JUST model was tested on a dataset of 13252 automobile reviews from Autohome (www.autohome.com.cn). The experimental results show that the sentiment classification accuracy of the JUST model is higher than that of the JST model and the TSMMF (Topic Sentiment Model based on Multi-feature Fusion) model. The topic and sentiment differences between users with different characteristics were also compared.
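
The prior construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature names, weight values, the positivity floor, and the function names `sentiment_prior` and `sample_document_sentiment` are all assumptions made for illustration. The sketch only shows how a linear function of a document author's features could supply the Dirichlet parameters from which a document-sentiment distribution is drawn.

```python
import random


def sentiment_prior(user_features, weights, floor=1e-3):
    """Return one Dirichlet parameter per sentiment label.

    Each parameter is a linear function of the document author's features:
    gamma[s] = sum_k user_features[k] * weights[s][k].
    The floor keeps every parameter positive (an assumption of this sketch).
    """
    return [
        max(floor, sum(x * w for x, w in zip(user_features, row)))
        for row in weights
    ]


def sample_document_sentiment(gamma, rng):
    """Draw a document-sentiment distribution from Dirichlet(gamma)."""
    # Normalizing independent Gamma(g, 1) draws yields a Dirichlet sample.
    draws = [rng.gammavariate(g, 1.0) for g in gamma]
    total = sum(draws)
    return [d / total for d in draws]


# Hypothetical user features: [bias term, is_male, owned_car_over_one_year]
features = [1.0, 1.0, 0.0]
# Hypothetical weights, one row per sentiment label (positive, negative)
weights = [[0.5, 0.3, 0.2],
           [0.5, 0.1, 0.6]]

gamma = sentiment_prior(features, weights)
pi = sample_document_sentiment(gamma, random.Random(0))
```

In a full model the weights would be estimated from data alongside Gibbs sampling of the topic and sentiment assignments; here they are fixed by hand so that the link between user features and the prior is visible.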

Key words: sentiment analysis, user characteristics, topic model, Latent Dirichlet Allocation (LDA), Gibbs sampling

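Abstract: The existing Joint Sentiment/Topic (JST) model can identify topics and sentiments in text simultaneously, but it mainly models textual content and does not consider user characteristics, which leads to user demographic bias and behavioral event bias in sentiment analysis results. A Joint-User Sentiment/Topic (JUST) model that considers user characteristics is proposed. Its main improvement is that user characteristics are added to the model: a linear function of the user characteristics corresponding to each document serves as the prior of the document-sentiment distribution, which yields the sentiment tendencies of user groups with different characteristics. The validity of the JUST model was tested on a dataset of 13252 automobile reviews from the Autohome website (www.autohome.com.cn). The experimental results show that the JUST model, with user characteristics added, outperforms the JST model and the TSMMF model in sentiment classification. Differences in topics of interest and sentiment among Autohome users with different characteristics were also compared.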

Key words: sentiment analysis, user characteristics, topic model, Latent Dirichlet Allocation (LDA), Gibbs sampling
