Journal of Computer Applications, 2016, Vol. 36, Issue (1): 175-180. DOI: 10.11772/j.issn.1001-9081.2016.01.0175

• Artificial Intelligence •

User sentiment model oriented to product attribute

JIA Wenjun1, ZHANG Hui1, YANG Chunming1, ZHAO Xujian1, LI Bo1,2

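  1. School of Computer Science and Technology, Southwest University of Science and Technology, Mianyang, Sichuan 621010, China;
  2. School of Computer Science and Technology, University of Science and Technology of China, Hefei 230027, China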
  • Received: 2015-07-09 Revised: 2015-09-08 Online: 2016-01-10 Published: 2016-01-09
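  • Corresponding author: ZHANG Hui (1972-), male, born in Susong, Anhui, professor, Ph.D., CCF member; research interests: data mining, knowledge engineering.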
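  • About the authors: JIA Wenjun (1991-), male, born in Guangyuan, Sichuan, M.S. candidate; research interests: sentiment analysis, text classification. YANG Chunming (1980-), male, born in Huaping, Yunnan, associate professor, M.S., CCF member; research interests: text mining, knowledge engineering. ZHAO Xujian (1984-), male, born in Xichang, Sichuan, lecturer, Ph.D., CCF member; research interests: text mining, Web information retrieval. LI Bo (1977-), male, born in Jiangyou, Sichuan, lecturer, Ph.D. candidate, CCF member; research interests: information filtering, information security.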
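  • Funding:
    Project of the Education Department of Sichuan Province (14ZB0113); Doctoral Fund of Southwest University of Science and Technology (12zx7116).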

User sentiment model oriented to product attribute

JIA Wenjun1, ZHANG Hui1, YANG Chunming1, ZHAO Xujian1, LI Bo1,2   

  1. School of Computer Science and Technology, Southwest University of Science and Technology, Mianyang, Sichuan 621010, China;
  2. School of Computer Science and Technology, University of Science and Technology of China, Hefei, Anhui 230027, China
  • Received: 2015-07-09 Revised: 2015-09-08 Online: 2016-01-10 Published: 2016-01-09
  • Supported by:
    This work is partially supported by the Education Department of Sichuan Province (14ZB0113) and the Doctoral Fund of Southwest University of Science and Technology (12zx7116).

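Abstract: Traditional sentiment models face two main problems when analyzing user sentiment in product reviews: 1) they lack fine-grained sentiment analysis targeted at product attributes; 2) the number of automatically extracted product attributes must be fixed in advance. To address these problems, a fine-grained User Sentiment Model (USM) oriented to product attributes is proposed. First, a Hierarchical Dirichlet Process (HDP) clusters noun entities into product attributes and determines their number automatically. Then, the weights of the noun entities within each product attribute, the evaluation phrases, and a sentiment lexicon are combined as a prior, and Latent Dirichlet Allocation (LDA) is used to classify the sentiment of each product attribute. Experimental results show that the model classifies sentiment with high accuracy, averaging 87%. Compared with traditional sentiment models, it achieves higher accuracy both in extracting product attributes and in classifying the sentiment of evaluation phrases.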

Keywords: sentiment model, fine-grained, product attribute, Hierarchical Dirichlet Process (HDP), Latent Dirichlet Allocation (LDA)

Abstract: Traditional sentiment models face two main problems in analyzing user sentiment in product reviews: 1) a lack of fine-grained sentiment analysis for product attributes; 2) the number of product attributes must be defined in advance. To alleviate these problems, a fine-grained model oriented to product attributes, named User Sentiment Model (USM), was proposed. Firstly, entities were clustered into product attributes by a Hierarchical Dirichlet Process (HDP), which also obtained the number of product attributes automatically. Then, the entity weights within product attributes, the evaluation phrases of product attributes, and a sentiment lexicon were combined as a prior. Finally, Latent Dirichlet Allocation (LDA) was used to classify the sentiment of product attributes. The experimental results show that the model achieves high accuracy in sentiment classification, with an average accuracy of 87%. Compared with traditional sentiment models, the proposed model obtains higher accuracy in extracting product attributes as well as in the sentiment classification of evaluation phrases.

Key words: sentiment model, fine-grained, product attribute, Hierarchical Dirichlet Process (HDP), Latent Dirichlet Allocation (LDA)
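
The abstract describes using a sentiment lexicon as prior knowledge for LDA. One common way to encode such knowledge, sketched below, is an asymmetric topic-word Dirichlet prior in which lexicon words receive extra prior mass under the matching sentiment label. The function name `build_lexicon_prior`, the hyperparameters `base` and `boost`, and the toy vocabulary and lexicon are all assumptions made for illustration. The abstract does not specify how USM encodes its prior or how entity weights and evaluation phrases enter it, so this should not be read as the authors' method.

```python
from typing import Dict, List


def build_lexicon_prior(
    vocab: List[str],
    lexicon: Dict[str, str],
    labels: List[str],
    base: float = 0.01,
    boost: float = 1.0,
) -> Dict[str, List[float]]:
    """Return one Dirichlet prior vector over `vocab` per sentiment label.

    Words absent from the lexicon get the symmetric value `base` under every
    label. A lexicon word gets `base + boost` under its own label and stays at
    `base` under the others (a hypothetical scheme, not taken from the paper).
    """
    prior: Dict[str, List[float]] = {}
    for label in labels:
        prior[label] = [
            base + boost if lexicon.get(word) == label else base
            for word in vocab
        ]
    return prior


# Toy data (assumed): two attribute nouns and two sentiment words.
vocab = ["screen", "battery", "great", "terrible"]
lexicon = {"great": "positive", "terrible": "negative"}
prior = build_lexicon_prior(vocab, lexicon, ["positive", "negative"])
```

Each resulting vector could serve as the topic-word prior for one sentiment label in an LDA implementation that accepts asymmetric priors, so that sampling is nudged toward, but not forced into, the lexicon's polarity.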

CLC number: