
CCFAI2017-226-Aspect Rating Prediction Based on Heterogeneous Network and Topic Modeling

Yugang JI1,2, Yitong LI1, Chuan SHI1

  1. Beijing University of Posts and Telecommunications
  2. Beijing Key Laboratory of Intelligent Telecommunications Software and Multimedia (Beijing University of Posts and Telecommunications)
  • Received:2017-05-31 Online:2017-05-31
  • Contact: Chuan SHI

Abstract: Traditional aspect rating prediction methods consider only textual content and ignore the structure of the review network. To address this problem, an aspect rating prediction algorithm that combines a heterogeneous information network with topic modeling (Heterogeneous Information Network To Aspect rating prediction, HINToAsp) is proposed. First, a review topic model built on opinion phrases, Phrase-based Probabilistic Latent Semantic Analysis (Phrase-PLSA), integrates review text and rating information to mine aspect topics. Second, to exploit the structural information among users, reviews, and items, a topic propagation model is defined on the “user-review-item” (“U-R-S”) heterogeneous information network to characterize user traits and item attributes. Finally, a random walk framework combines the textual and structural information to predict aspect ratings. In accuracy comparisons on the Dianping and TripAdvisor datasets, HINToAsp outperforms the Quad-tuples PLSA (QPLSA) model, the Gaussian distribution for RAting Over Sentiments (GRAOS) model, and the Sentiment-Aligned Topic Model (SATM), which suggests that it is well suited to item recommendation systems.

Key words: aspect rating prediction, heterogeneous information network, topic modeling, structural information, recommendation system
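
The abstract gives no formulas for how the random walk framework combines content and structure, so the following is only an illustrative sketch of the general idea, not the HINToAsp algorithm itself. It assumes each review already has a content-based aspect score (standing in for the output of a topic model such as Phrase-PLSA) and smooths those scores over a user-review-item graph with a simple restart-style update. The function name `propagate_aspect_scores`, the data layout, and the parameters `alpha` and `iterations` are all hypothetical.

```python
from collections import defaultdict


def propagate_aspect_scores(reviews, priors, alpha=0.5, iterations=20):
    """Smooth content-based aspect scores over a user-review-item graph.

    reviews: dict mapping review id -> (user id, item id)  (assumed layout)
    priors:  dict mapping review id -> content-based aspect score
    alpha:   weight kept on the content prior at every step (the "restart")
    """
    scores = dict(priors)
    for _ in range(iterations):
        # Collect the current review scores attached to each user and item.
        user_scores = defaultdict(list)
        item_scores = defaultdict(list)
        for rid, (user, item) in reviews.items():
            user_scores[user].append(scores[rid])
            item_scores[item].append(scores[rid])

        updated = {}
        for rid, (user, item) in reviews.items():
            user_mean = sum(user_scores[user]) / len(user_scores[user])
            item_mean = sum(item_scores[item]) / len(item_scores[item])
            # Structural signal: average of the user-side and item-side means.
            structural = 0.5 * (user_mean + item_mean)
            # Blend the fixed content prior with the structural signal.
            updated[rid] = alpha * priors[rid] + (1 - alpha) * structural
        scores = updated
    return scores


# Toy data: r1 and r2 share a user and an item; r3 is isolated.
reviews = {"r1": ("u1", "s1"), "r2": ("u1", "s1"), "r3": ("u2", "s2")}
priors = {"r1": 2.0, "r2": 4.0, "r3": 5.0}
smoothed = propagate_aspect_scores(reviews, priors)
```

In this toy example, the scores of `r1` and `r2` are pulled toward each other because they share a user and an item, while `r3` keeps its content-based score because it has no neighbors. Setting `alpha=1.0` ignores structure entirely and returns the content priors.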
