Journal of Computer Applications ›› 2017, Vol. 37 ›› Issue (11): 3201-3206. DOI: 10.11772/j.issn.1001-9081.2017.11.3201

• 2017 China Computer Federation Conference on Artificial Intelligence (CCFAI 2017)

Aspect rating prediction based on heterogeneous network and topic model

JI Yugang (吉余岗)1,2, LI Yitong (李依桐)1,2, SHI Chuan (石川)1,2

  1. School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China;
  2. Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia (Beijing University of Posts and Telecommunications), Beijing 100876, China
  • Received: 2017-05-11 Revised: 2017-05-31 Online: 2017-11-10 Published: 2017-11-11
  • Corresponding author: SHI Chuan
  • About the authors: JI Yugang (born 1993), male, from Taizhou, Jiangsu, Ph.D. candidate, CCF member; research interests: data mining, machine learning. LI Yitong (born 1992), female, from Beijing, M.S.; research interests: data mining, machine learning. SHI Chuan (born 1978), male, from Beijing, professor, Ph.D., CCF member; research interests: data mining, machine learning, evolutionary computation.
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61375058), the National Basic Research Program (973 Program) of China (2013cb329606), and the Co-construction Project of Beijing Municipal Commission of Education.

Abstract: Traditional aspect rating prediction methods consider only textual content and ignore the structural information in the review network. To address this problem, a novel Aspect rating prediction method based on Heterogeneous Information Network and Topic model (HINToAsp) was proposed to integrate textual information and structural information effectively. Firstly, a review topic model built on opinion phrases, called Phrase-PLSA (Phrase-based Probabilistic Latent Semantic Analysis), was put forward to integrate the textual information of reviews with their ratings for mining aspect topics. Then, considering the rich structural information among users, reviews, and items, a topic propagation model was designed on a "User-Review-Item" heterogeneous information network to characterize user traits and item attributes. Finally, a random walk framework was used to combine textual information and structural information for accurate aspect rating prediction. Accuracy comparisons on the Dianping and TripAdvisor corpora demonstrate that HINToAsp is more effective than recent methods such as the Quad-tuples PLSA (QPLSA) model, the Gaussian distribution for RAting Over Sentiments (GRAOS) model and the Sentiment-Aligned Topic Model (SATM), and that it can be better applied to item recommendation systems.

Key words: aspect rating prediction, Heterogeneous Information Network (HIN), topic model, structural information, recommendation system
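
The abstract does not give the formulas behind the topic propagation and random walk steps. The following is only a minimal sketch of the general idea, using a generic random walk with restart rather than the HINToAsp formulation. It assumes each review already has an aspect-topic vector from some text model, and it spreads those vectors to users and items over a toy "User-Review-Item" graph. The names `propagate`, `adj`, `prior` and `alpha`, and all numbers, are hypothetical.

```python
def propagate(adj, prior, alpha=0.5, iters=50):
    """Random walk with restart on an undirected graph (hypothetical helper).

    adj   : dict mapping each node to a non-empty list of neighbours
    prior : dict mapping review nodes to their text-derived aspect vector;
            nodes without text (users, items) are treated as all-zero
    alpha : restart weight, i.e. how strongly a node keeps its prior
    """
    k = len(next(iter(prior.values())))
    zero = [0.0] * k
    state = {node: list(prior.get(node, zero)) for node in adj}
    for _ in range(iters):
        new_state = {}
        for node, nbrs in adj.items():
            # One walk step: average the neighbours' current vectors.
            avg = [sum(state[m][j] for m in nbrs) / len(nbrs) for j in range(k)]
            p = prior.get(node, zero)
            # Mix the structural signal with the node's own textual prior.
            new_state[node] = [(1 - alpha) * avg[j] + alpha * p[j] for j in range(k)]
        state = new_state
    # Rescale every vector so it sums to 1 and reads as a distribution.
    result = {}
    for node, vec in state.items():
        total = sum(vec)
        result[node] = [x / total for x in vec] if total > 0 else vec
    return result


# Toy network (assumed data): users write reviews, reviews are about items.
adj = {
    "u1": ["r1", "r2"],
    "u2": ["r3"],
    "r1": ["u1", "i1"],
    "r2": ["u1", "i2"],
    "r3": ["u2", "i1"],
    "i1": ["r1", "r3"],
    "i2": ["r2"],
}
# Assumed two-aspect vectors per review, e.g. produced by a topic model.
prior = {"r1": [0.8, 0.2], "r2": [0.6, 0.4], "r3": [0.1, 0.9]}

result = propagate(adj, prior)
```

In this toy run, `u2` ends up with the same aspect profile as `r3`, its only review, while `i1` blends the profiles of `r1` and `r3`. This shows how structure can supply a profile for nodes that have no text of their own.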
