Journal of Computer Applications ›› 2014, Vol. 34 ›› Issue (6): 1636-1640. DOI: 10.11772/j.issn.1001-9081.2014.06.1636

• Artificial intelligence •

Multi-document sentiment summarization based on the latent Dirichlet allocation model

XUN Jing1,2, LIU Peiyu1,2, YANG Yuzhen1,2, ZHANG Yanhui1,2

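  1. Shandong Provincial Key Laboratory for Distributed Computer Software Novel Technology, Jinan 250014
  2. School of Information Science and Engineering, Shandong Normal University, Jinan 250014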
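  • Received: 2013-12-23 Revised: 2014-02-19 Online: 2014-06-01 Published: 2014-07-02
  • Corresponding author: XUN Jing
  • About the authors: XUN Jing (1989-), female, born in Linyi, Shandong, master's student, CCF member; main research interests: text summarization and Chinese sentiment orientation analysis. LIU Peiyu (1960-), male, born in Weifang, Shandong, professor and doctoral supervisor; main research interests: computer network information security and natural language processing. YANG Yuzhen (1978-), female, born in Heze, Shandong, Ph.D. candidate; main research interest: Chinese sentiment orientation analysis. ZHANG Yanhui (1989-), male, born in Binzhou, Shandong, master's student, CCF member; main research interest: Chinese sentiment orientation analysis.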
  • Funding:

    National Natural Science Foundation of China; Natural Science Foundation of Shandong Province; National Social Science Fund of China

Multi-document sentiment summarization based on latent Dirichlet allocation model

XUN Jing1,2, LIU Peiyu1,2, YANG Yuzhen1,2, ZHANG Yanhui1,2

  1. School of Information Science and Engineering, Shandong Normal University, Jinan Shandong 250014, China;
  2. Shandong Provincial Key Laboratory for Distributed Computer Software Novel Technology, Jinan Shandong 250014, China
  • Received: 2013-12-23 Revised: 2014-02-19 Online: 2014-06-01 Published: 2014-07-02
  • Contact: XUN Jing
  • Supported by:

    National Natural Science Foundation of China; Natural Science Foundation of Shandong Province; National Social Science Fund

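Abstract (translated from the Chinese):

To address the difficulty current methods have in capturing the global sentiment orientation of review text, a multi-document sentiment summarization method based on the latent Dirichlet allocation (LDA) model is proposed. The method first performs sentiment analysis on the given sentences and extracts those carrying subjective evaluations; it then represents the extracted sentences with an LDA model and computes each sentence's weight from word importance and sentence features; finally, it extracts the sentiment summary. Experimental results show that the method effectively identifies key sentiment sentences and achieves good precision, recall, and F-measure.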

Abstract:

Existing methods have difficulty capturing the overall sentiment orientation of review text. To address this problem, a multi-document sentiment summarization method based on the Latent Dirichlet Allocation (LDA) model is proposed. The method first applies sentiment analysis to extract the subjective sentences and represents them with an LDA model. It then computes a weight for each sentence by combining word importance with sentence features, and generates the summary according to these weights. Experimental results show that the method effectively identifies key sentiment sentences and performs well in precision, recall, and F-measure.
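The pipeline outlined in the abstract (subjectivity filtering, LDA representation, sentence weighting, extraction) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the toy lexicon `SENTIMENT_WORDS`, the collapsed Gibbs sampler `lda_gibbs`, the scoring rule in `summarize`, and all parameter values are assumptions introduced here, since the abstract does not give the actual formulas.

```python
import random

# Hypothetical toy lexicon; a real system would use a proper sentiment resource.
SENTIMENT_WORDS = {"good", "great", "excellent", "bad", "poor", "terrible", "love", "hate"}


def tokenize(sentence):
    """Lowercase whitespace tokenizer that strips surrounding punctuation."""
    tokens = (w.strip(".,!?;:").lower() for w in sentence.split())
    return [t for t in tokens if t]


def is_subjective(tokens):
    """Assumed subjectivity test: the sentence contains a lexicon word."""
    return any(t in SENTIMENT_WORDS for t in tokens)


def lda_gibbs(docs, n_topics=2, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Collapsed Gibbs sampling for LDA; returns (theta, phi, vocab)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    w2i = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    ndk = [[0] * n_topics for _ in docs]            # document-topic counts
    nkw = [[0] * n_words for _ in range(n_topics)]  # topic-word counts
    nk = [0] * n_topics                             # tokens assigned to each topic
    z = []                                          # topic assignment per token
    for d, doc in enumerate(docs):
        z.append([rng.randrange(n_topics) for _ in doc])
        for w, k in zip(doc, z[d]):
            ndk[d][k] += 1
            nkw[k][w2i[w]] += 1
            nk[k] += 1
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k, wi = z[d][i], w2i[w]
                # Remove the token's current assignment, then resample its topic.
                ndk[d][k] -= 1
                nkw[k][wi] -= 1
                nk[k] -= 1
                weights = [(ndk[d][t] + alpha) * (nkw[t][wi] + beta) / (nk[t] + n_words * beta)
                           for t in range(n_topics)]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                ndk[d][k] += 1
                nkw[k][wi] += 1
                nk[k] += 1
    theta = [[(c + alpha) / (len(doc) + n_topics * alpha) for c in row]
             for row, doc in zip(ndk, docs)]
    phi = [[(c + beta) / (nk[k] + n_words * beta) for c in nkw[k]]
           for k in range(n_topics)]
    return theta, phi, vocab


def summarize(sentences, k=2, **lda_kwargs):
    """Return the k highest-weighted subjective sentences."""
    pairs = [(s, tokenize(s)) for s in sentences]
    subjective = [(s, toks) for s, toks in pairs if is_subjective(toks)]
    if not subjective:
        return []
    theta, phi, vocab = lda_gibbs([toks for _, toks in subjective], **lda_kwargs)
    w2i = {w: i for i, w in enumerate(vocab)}
    scored = []
    for (s, toks), th in zip(subjective, theta):
        # Assumed word importance: p(w | sentence) = sum_t theta[t] * phi[t][w].
        importance = sum(th[t] * phi[t][w2i[w]]
                         for w in toks for t in range(len(th))) / len(toks)
        # Assumed sentence feature: share of sentiment-bearing words.
        feature = sum(w in SENTIMENT_WORDS for w in toks) / len(toks)
        scored.append((importance * (1 + feature), s))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in scored[:k]]


reviews = [
    "The battery life is great and the screen is excellent.",
    "The phone ships with a charger and a cable.",
    "I hate the poor camera quality.",
    "The screen is great but the camera is bad.",
    "It was released in March.",
]
summary = summarize(reviews, k=2)
```

Here each subjective sentence is treated as one LDA document, and the sentence weight multiplies an LDA-based word importance by a simple sentiment-density feature. Both choices are placeholders for whatever word-importance and sentence-feature definitions the full paper uses.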

CLC number: