Journal of Computer Applications ›› 2014, Vol. 34 ›› Issue (9): 2571-2576. DOI: 10.11772/j.issn.1001-9081.2014.09.2571

• Artificial Intelligence •

Consumption sentiment classification based on two-dimensional coordinate mapping method

LIN Mingming1, QIU Yunfei1, SHAO Liangshan2

  1. School of Software, Liaoning Technical University, Huludao, Liaoning 125105, China
  2. System Engineering Institute, Liaoning Technical University, Huludao, Liaoning 125105, China
  • Received: 2014-04-09 Revised: 2014-05-20 Online: 2014-09-01 Published: 2014-09-30
  • Corresponding author: QIU Yunfei
  • About the authors:
    LIN Mingming (born 1989), female, from Dalian, Liaoning; master's student. Research interests: data mining, sentiment analysis.
    QIU Yunfei (born 1976), male, from Fuxin, Liaoning; professor, Ph.D., CCF member. Research interests: data mining, sentiment analysis.
    SHAO Liangshan (born 1962), male, from Lingyuan, Liaoning; professor, Ph.D. Research interests: data mining, sentiment analysis.
  • Supported by:

    National Natural Science Foundation of China; Liaoning Province Innovation Team Project; Liaoning Province Higher Education Outstanding Young Scholars Growth Plan

Abstract:

To address sentiment classification of Chinese consumer reviews, a corpus-based sentiment classification method built on two-dimensional coordinate mapping is proposed. Taking the characteristics of the Chinese language into account, the method first introduces a corpus-based search procedure that makes searching more targeted. Second, it defines rules for extracting Chinese phrases that express sentiment. Third, it constructs an algorithm for selecting the optimal seed words for a given domain. Finally, it constructs a two-dimensional coordinate mapping algorithm that computes coordinate values for each review sentence, maps the sentence into a two-dimensional Cartesian coordinate system, and determines its semantic orientation. Experiments were conducted on 1200 milk-related reviews (600 positive and 600 negative) from one seller on Amazon. The optimal seed word selection algorithm chose "henhao-lou" (很好漏, from 很好 "very good" and 漏 "leak") as the optimal seed word, and the two-dimensional coordinate mapping algorithm then determined the sentiment polarity of each review. The average F-measure exceeded 85%. The results show that the proposed algorithm can classify the sentiment of Chinese consumer reviews.
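The abstract reports an average F-measure but does not say how the average is taken. As a minimal sketch, assuming the figure is the unweighted mean of the per-class F-measures for the positive and negative classes, the calculation could look like the following. The function names, label strings, and toy data are hypothetical and are not taken from the paper.

```python
def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall for one class."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def average_f_measure(gold, predicted, labels=("pos", "neg")):
    """Unweighted mean of per-class F-measures (an assumed averaging scheme)."""
    scores = []
    for label in labels:
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        scores.append(f_measure(tp, fp, fn))
    return sum(scores) / len(scores)


# Toy example with made-up labels, not data from the experiments.
gold = ["pos", "pos", "neg", "neg"]
predicted = ["pos", "neg", "neg", "neg"]
average = average_f_measure(gold, predicted)
```

Because the data set in the abstract is balanced (600 positive and 600 negative reviews), an unweighted mean and a class-size-weighted mean would coincide there; on unbalanced data the two would differ.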

CLC Number: