Journal of Computer Applications ›› 2016, Vol. 36 ›› Issue (10): 2767-2771. DOI: 10.11772/j.issn.1001-9081.2016.10.2767

• Artificial Intelligence •

Computing review helpfulness based on the opinion support of user replies

李学明, 张朝阳, 佘维军   

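  1. School of Computer, Chongqing University, Chongqing 400030, China
  • Received: 2016-03-21 Revised: 2016-06-17 Online: 2016-10-10 Published: 2016-10-10
  • Corresponding author: ZHANG Chaoyang, E-mail: nwpuzhangcy@163.com
  • About the authors: LI Xueming (born 1967), male, from Chongqing, professor, Ph.D.; research interests: data mining, big data processing. ZHANG Chaoyang (born 1991), female, from Luoyang, Henan, M.S. candidate; research interests: data mining, natural language processing. SHE Weijun (born 1991), male, from Nanchong, Sichuan, M.S. candidate; research interests: data mining, natural language processing.
  • Supported by:
    National Natural Science Foundation of China (90818028).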

Review helpfulness based on opinion support of user discussion

LI Xueming, ZHANG Chaoyang, SHE Weijun   

  1. School of Computer, Chongqing University, Chongqing 400030, China
  • Received: 2016-03-21 Revised: 2016-06-17 Online: 2016-10-10 Published: 2016-10-10
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (90818028).

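Abstract: To address the difficulty of constructing training datasets for supervised review helpfulness prediction, and the lack of sentiment information in unsupervised methods, an unsupervised model built on semantic and sentiment information is proposed to predict review helpfulness. The model considers how strongly both the reviews and the replies posted under them support each opinion, uses that support to compute a helpfulness score for the opinion, and from these scores derives the helpfulness of each review. In addition, a review summarization method combining syntactic analysis with an improved Latent Dirichlet Allocation (LDA) model is proposed for opinion extraction within the helpfulness prediction model: two kinds of constraints, must-link and cannot-link, are built from the syntactic analysis results to guide topic model learning, improving the accuracy of the model while maintaining recall. On the experimental dataset the method achieves an F1 value of about 70% and a ranking accuracy of about 90%, and a case study also shows that its results are readily interpretable.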

Keywords: review helpfulness, opinion support, sentiment analysis, opinion summarization, user reply

Abstract: Review helpfulness prediction faces two problems: training datasets for supervised models are difficult to construct, and unsupervised methods do not take sentiment information into account. To address both, an unsupervised model combining semantic and sentiment information was proposed. First, an opinion helpfulness score was calculated from the opinion support scores of reviews and of the replies to them, and the review helpfulness score was then derived from it. In addition, a review summarization method combining syntactic analysis with an improved Latent Dirichlet Allocation (LDA) model was proposed to extract opinions for review helpfulness prediction. Two kinds of constraints, must-link and cannot-link, were constructed from the results of syntactic analysis to guide topic learning, which improves the accuracy of the model while maintaining the recall rate. On the experimental dataset, the proposed model achieves an F1 value of about 70% and a sorting accuracy of nearly 90%, and a case study also shows that the model has good explanatory ability.

Key words: review helpfulness, opinion support, sentiment analysis, opinion summarization, user discussion
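
The abstract does not give the scoring formulas, so the following is only a toy sketch of the general idea that an opinion gains support when reviews and replies agree with it, and that a review inherits helpfulness from the opinions it expresses. The `Mention` type, the net-count support formula, and the clipped-mean aggregation are all assumptions made for illustration; they are not the authors' definitions.

```python
from dataclasses import dataclass


@dataclass
class Mention:
    """One mention of an opinion in a review or a reply (hypothetical type)."""
    opinion: str   # opinion label, e.g. "battery life"
    polarity: int  # +1 if the text agrees with the opinion, -1 if it disagrees


def opinion_support(mentions):
    """Assumed formula: net support = agreeing mentions minus disagreeing ones."""
    support = {}
    for m in mentions:
        support[m.opinion] = support.get(m.opinion, 0) + m.polarity
    return support


def review_helpfulness(review_opinions, support):
    """Assumed aggregation: mean support over the review's opinions,
    with negative support clipped to zero."""
    if not review_opinions:
        return 0.0
    clipped = [max(support.get(o, 0), 0) for o in review_opinions]
    return sum(clipped) / len(clipped)


# A review states two opinions; the replies under it agree or disagree.
mentions = [
    Mention("battery life", +1),  # the review itself
    Mention("screen", +1),        # the review itself
    Mention("battery life", +1),  # reply agrees
    Mention("battery life", -1),  # reply disagrees
    Mention("screen", -1),        # reply disagrees
    Mention("screen", -1),        # reply disagrees
]
support = opinion_support(mentions)
score = review_helpfulness(["battery life", "screen"], support)
print(support)  # {'battery life': 1, 'screen': -1}
print(score)    # 0.5
```

In this sketch the polarity labels are given by hand; in the approach the abstract describes, they would come from opinion extraction and sentiment analysis of the review and reply text.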

CLC Number: