Journal of Computer Applications, 2015, Vol. 35, Issue (12): 3481-3486. DOI: 10.11772/j.issn.1001-9081.2015.12.3481

• Artificial Intelligence •

Fine-grained sentiment analysis for product reviews

LIU Li, WANG Yongheng, WEI Hang

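  1. College of Information Science and Engineering, Hunan University, Changsha 410082
  • Received: 2015-06-23 Revised: 2015-08-02 Online: 2015-12-10 Published: 2015-12-10
  • Corresponding author: LIU Li (born 1990), female, from Linfen, Shanxi; master's student; research interests: text analysis and data mining
  • About the authors: WANG Yongheng (born 1973), male, from Bazhou, Hebei; associate professor, Ph.D.; research interests: large-scale databases, data mining, and complex event processing for the Internet of Things. WEI Hang (born 1990), female, from Liuzhou, Guangxi; master's degree; research interests: text analysis and data mining.
  • Funding:
    National Natural Science Foundation of China (61371116); Natural Science Foundation of Hunan Province (13JJ3046).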

Fine-grained sentiment analysis oriented to product comment

LIU Li, WANG Yongheng, WEI Hang   

  1. College of Information Science and Engineering, Hunan University, Changsha, Hunan 410082, China
  • Received: 2015-06-23 Revised: 2015-08-02 Online: 2015-12-10 Published: 2015-12-10

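Abstract: Traditional coarse-grained sentiment analysis ignores the specific opinion target, and existing fine-grained methods ignore irrelevant opinion elements. To address both problems, a method combining Conditional Random Field (CRF) and syntax tree pruning is proposed for fine-grained sentiment analysis of product reviews. A MapReduce-based parallel tri-training method is used for semi-automatic corpus annotation, and a CRF model that fuses multiple linguistic features is used to extract the opinion targets and the positive and negative opinion words from reviews. Syntax tree pruning is implemented by building a domain ontology and a syntactic path library: for text containing multiple opinion targets and opinion words, interference from irrelevant opinion targets is removed and the correct appraisal units are extracted. Finally, a visual product report is generated. Experimental results on datasets from two different domains show that the proposed method reaches an overall accuracy of about 89% in identifying sentiment elements, and also about 89% for sentiment appraisal units. The results indicate that, compared with traditional methods, the method combining CRF and syntax tree pruning achieves higher recognition accuracy and better performance.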

Keywords: product review, fine-grained sentiment analysis, MapReduce, tri-training, conditional random field, syntax tree pruning
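To make the tri-training idea mentioned in the abstract more concrete, the following is a minimal Python sketch of the agreement step only, assuming the standard formulation in which an unlabeled example is handed to one classifier when the other two agree on its label. The function name `agree_and_label` and the toy keyword classifiers are hypothetical; the error-rate checks of full tri-training and the authors' MapReduce parallelization are not reproduced here.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical classifier type: maps a piece of text to a label.
Classifier = Callable[[str], str]


def agree_and_label(
    classifiers: Sequence[Classifier], unlabeled: Sequence[str]
) -> Dict[int, List[Tuple[str, str]]]:
    """For each classifier i, collect the examples on which the other two agree.

    This is only the agreement step; full tri-training also checks
    estimated error rates before accepting the new examples.
    """
    if len(classifiers) != 3:
        raise ValueError("tri-training uses exactly three classifiers")
    additions: Dict[int, List[Tuple[str, str]]] = {0: [], 1: [], 2: []}
    for text in unlabeled:
        # Each example is voted on independently of the others.
        votes = [clf(text) for clf in classifiers]
        for i in range(3):
            j, k = [n for n in range(3) if n != i]
            if votes[j] == votes[k]:
                additions[i].append((text, votes[j]))
    return additions


# Toy keyword classifiers (assumptions for illustration only).
def clf_a(text: str) -> str:
    return "POS" if "good" in text else "NEG"


def clf_b(text: str) -> str:
    return "POS" if ("good" in text or "great" in text) else "NEG"


def clf_c(text: str) -> str:
    return "POS" if "great" in text else "NEG"


new_examples = agree_and_label(
    [clf_a, clf_b, clf_c], ["good screen", "great battery", "slow"]
)
```

Because each unlabeled example is voted on independently, the loop body could plausibly serve as the map step of a parallel job, but that mapping is an assumption and not a description of the paper's implementation.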

Abstract: Traditional sentiment analysis is coarse-grained and ignores the targets of opinions, while existing fine-grained sentiment analysis methods ignore sentences that contain multiple targets and multiple opinions. To solve these problems, a fine-grained sentiment analysis method based on Conditional Random Field (CRF) and syntax tree pruning was proposed. A parallel tri-training method based on MapReduce was used to label the corpus semi-automatically. A CRF model integrating multiple linguistic features was used to extract opinion targets and positive/negative opinion words from review sentences. To handle multi-target, multi-opinion sentences, syntax tree pruning was performed by building a domain ontology and a syntactic path library, which eliminates irrelevant opinion targets so that the correct appraisal expressions can be extracted. Finally, a visual product attribute report was generated. After syntax tree pruning, the accuracy of the proposed method on both sentiment elements and appraisal expressions reached approximately 89%. The experimental results on two product domains, mobile phones and cameras, show that the proposed method outperforms traditional methods in both sentiment analysis accuracy and training performance.

Key words: product comment, fine-grained sentiment analysis, MapReduce, Tri-training, Conditional Random Field (CRF), syntax tree pruning
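The abstract describes using a syntactic path library to keep only the correct target/opinion pairs in sentences with several of each. One way to picture that filtering step is the sketch below, which assumes a toy parse tree stored as parent pointers and a library of allowed label paths. The tree encoding, the labels, and the names `syntactic_path` and `pair_opinions` are all hypothetical; the domain ontology and the authors' actual pruning rules are not modeled.

```python
from typing import Dict, List, Optional, Sequence, Set, Tuple

# Hypothetical toy parse: node id -> (word, syntactic label, parent id or None for the root).
Tree = Dict[int, Tuple[str, str, Optional[int]]]


def ancestors(tree: Tree, node: int) -> List[int]:
    """Return the chain node, parent, ..., root."""
    chain = [node]
    while tree[chain[-1]][2] is not None:
        chain.append(tree[chain[-1]][2])
    return chain


def syntactic_path(tree: Tree, src: int, dst: int) -> Tuple[str, ...]:
    """Labels along the path src -> lowest common ancestor -> dst."""
    up = ancestors(tree, src)
    down = ancestors(tree, dst)
    common = next(n for n in up if n in down)
    up_part = up[: up.index(common) + 1]
    down_part = down[: down.index(common)][::-1]
    return tuple(tree[n][1] for n in up_part + down_part)


def pair_opinions(
    tree: Tree,
    targets: Sequence[int],
    opinions: Sequence[int],
    path_library: Set[Tuple[str, ...]],
) -> List[Tuple[str, str]]:
    """Keep (target, opinion) pairs whose connecting path is in the library."""
    pairs = []
    for o in opinions:
        for t in targets:
            if syntactic_path(tree, o, t) in path_library:
                pairs.append((tree[t][0], tree[o][0]))
    return pairs


# Toy tree for "the screen is clear but the battery is weak" (labels are illustrative).
toy_tree: Tree = {
    0: ("", "S", None),
    1: ("", "IP", 0),
    2: ("screen", "NN", 1),
    3: ("clear", "JJ", 1),
    4: ("", "IP", 0),
    5: ("battery", "NN", 4),
    6: ("weak", "JJ", 4),
}
toy_library = {("JJ", "IP", "NN")}  # hypothetical allowed path
toy_pairs = pair_opinions(toy_tree, targets=[2, 5], opinions=[3, 6], path_library=toy_library)
```

In this toy setup an opinion word is paired only with a target inside the same clause, because the path that crosses the root is not in the library; a real path library would be built from annotated data.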
