Journal of Computer Applications ›› 2015, Vol. 35 ›› Issue (10): 2747-2751. DOI: 10.11772/j.issn.1001-9081.2015.10.2747

• Paper from the 15th China Conference on Machine Learning (CCML2015) •

Analysis of public emotion evolution based on probabilistic latent semantic analysis

LIN Jianghao1, ZHOU Yongmei1,2, YANG Aimin1,2, CHEN Yuhong1, CHEN Xiaofan1

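  1. Cisco School of Informatics, Guangdong University of Foreign Studies, Guangzhou 510006;
    2. Laboratory for Language Engineering and Computing, Guangdong University of Foreign Studies, Guangzhou 510006
  • Received: 2015-05-27 Revised: 2015-07-05 Online: 2015-10-10 Published: 2015-10-14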
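  • Corresponding author: LIN Jianghao (1985-), male, from Jieyang, Guangdong, assistant engineer, M.S., CCF member; main research interests: natural language processing, text sentiment analysis; lin_hao@foxmail.com
  • About the authors: ZHOU Yongmei (1971-), female, from Yongzhou, Hunan, professor, CCF senior member; main research interests: text sentiment analysis, public opinion detection. YANG Aimin (1970-), male, from Yongzhou, Hunan, professor, Ph.D., CCF senior member; main research interest: text orientation analysis. CHEN Yuhong (1993-), male, from Chaozhou, Guangdong; main research interest: text sentiment analysis. CHEN Xiaofan (1993-), male, from Jamaica; main research interest: text sentiment analysis.
  • Supported by:
    National Social Science Foundation of China (12BYY045); Program for New Century Excellent Talents in University of the Ministry of Education (NCET-12-0939); Humanities and Social Sciences Research Project of the Ministry of Education (14YJA740011); Science and Technology Innovation Project of the Department of Education of Guangdong Province (2013KJCX0067); 2015 Guangzhou Philosophy and Social Sciences "Twelfth Five-Year Plan" Project (15Q16); Guangdong University of Foreign Studies University-level Project (14Q3); Guangdong University of Foreign Studies Graduate Research Innovation Project (14GWCXXM-36).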

Analysis of public emotion evolution based on probabilistic latent semantic analysis

LIN Jianghao1, ZHOU Yongmei1,2, YANG Aimin1,2, CHEN Yuhong1, CHEN Xiaofan1   

  1. Cisco School of Informatics, Guangdong University of Foreign Studies, Guangzhou, Guangdong 510006, China;
    2. Laboratory for Language Engineering and Computing, Guangdong University of Foreign Studies, Guangzhou, Guangdong 510006, China
  • Received:2015-05-27 Revised:2015-07-05 Online:2015-10-10 Published:2015-10-14

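Abstract: To address two difficulties in analyzing the evolution of public emotion, namely mining topic content and analyzing the public emotion associated with it, a method for public emotion evolution analysis based on the Probabilistic Latent Semantic Analysis (PLSA) model is proposed. The method first uses the PLSA model to extract subtopics along the time series and to mine how topic content evolves over time. It then uses syntactic relations and a sentiment ontology lexicon to extract the public emotion units that match the topic content, computes the intensity of each emotion unit, and forms emotion feature vectors. Finally, the emotion intensities under each subtopic are summed to analyze, at a fine granularity, the public emotion of each subtopic and of the event as a whole, to mine the patterns of public emotion evolution in depth, and to quantify and visualize public emotion. Introducing syntactic rules and the sentiment ontology lexicon into the extraction of topic emotion units makes the extraction finer-grained and improves the accuracy of matching topic content to emotion units. Experimental results show that the model supports evolution analysis of topic content and its public emotion along the time series, which verifies the effectiveness of the proposed method.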

Key words: public emotion, Probabilistic Latent Semantic Analysis (PLSA) model, topic mining, emotion evolution, emotion analysis
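
The subtopic extraction step relies on PLSA. As a rough illustration only, the Python sketch below fits a PLSA model with the standard EM updates on a toy document-term count matrix. The function name `plsa`, its parameters, and the toy data are assumptions made for this example and are not taken from the paper; the paper's preprocessing and time-slicing are not reproduced.

```python
import numpy as np

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit PLSA by EM on a (documents x words) count matrix.

    Returns P(z|d) with shape (docs, topics) and P(w|z) with shape (topics, words).
    Hypothetical helper for illustration, not the authors' implementation.
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    # Random initialisation, normalised into valid probability distributions.
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    eps = 1e-12  # guards against division by zero
    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]  # docs x topics x words
        post = joint / (joint.sum(axis=1, keepdims=True) + eps)
        # M-step: re-estimate both distributions from expected counts n(d,w) * P(z|d,w).
        expected = counts[:, None, :] * post
        p_w_z = expected.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = expected.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps
    return p_z_d, p_w_z

# Toy corpus (assumed data): 4 documents over a 4-word vocabulary.
counts = np.array([[4, 3, 0, 0],
                   [5, 2, 0, 1],
                   [0, 0, 6, 3],
                   [0, 1, 4, 5]], dtype=float)
p_z_d, p_w_z = plsa(counts, n_topics=2)
```

For a time-series analysis, one option is to fit such a model on the documents of each time slice and compare the resulting P(w|z) distributions across slices. The abstract does not say how subtopics are aligned across slices, so that part is left open here.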

Abstract: To address the problems of topic mining and the analysis of the corresponding public emotion, a method for analyzing public emotion evolution based on the Probabilistic Latent Semantic Analysis (PLSA) model was proposed. To find the evolutionary patterns of topics, the method first extracted subtopics along the time series using the PLSA model. Then, emotion units matching the topic content were extracted via syntactic parsing and an ontology lexicon, and emotion feature vectors were built from these units and their weights. Next, the emotion strengths under each subtopic were summed to analyze, at a fine granularity, the public emotion of each subtopic and of the issue as a whole. In this way, the method mined the evolutionary patterns of public emotion in depth, and these patterns were finally quantified and visualized. The advantage of the method lies in introducing grammatical rules and the ontology lexicon into the extraction of emotion units, which makes the extraction finer-grained and improves its accuracy. The experimental results show that the method performs well in the evolutionary analysis of topics and public emotion over time, which supports its effectiveness.

Key words: public emotion, Probabilistic Latent Semantic Analysis (PLSA) model, topic mining, emotion evolution, emotion analysis
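
The aggregation step, in which emotion intensities are summed per subtopic and then for the event as a whole, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the emotion categories in `EMOTIONS`, the tuple layout of an emotion unit, and the function names `aggregate_by_subtopic` and `aggregate_by_event` are all hypothetical, and the extraction of emotion units through syntactic rules and the ontology lexicon is not shown.

```python
from collections import defaultdict

# Hypothetical emotion categories; the paper's ontology may define others.
EMOTIONS = ("joy", "anger", "sadness", "fear")

def _zero_vector():
    """Return an emotion feature vector with all intensities set to zero."""
    return dict.fromkeys(EMOTIONS, 0.0)

def aggregate_by_subtopic(units):
    """Sum emotion-unit intensities per (time slice, subtopic).

    Each unit is assumed to be a tuple (time_slice, subtopic, emotion, intensity).
    """
    totals = defaultdict(_zero_vector)
    for time_slice, subtopic, emotion, intensity in units:
        totals[(time_slice, subtopic)][emotion] += intensity
    return dict(totals)

def aggregate_by_event(per_subtopic):
    """Sum the subtopic vectors of each time slice into an event-level vector."""
    totals = defaultdict(_zero_vector)
    for (time_slice, _subtopic), vector in per_subtopic.items():
        for emotion, value in vector.items():
            totals[time_slice][emotion] += value
    return dict(totals)

# Assumed sample data: emotion units already extracted and scored.
units = [
    ("2015-01", "topic_a", "anger", 2.0),
    ("2015-01", "topic_a", "anger", 1.5),
    ("2015-01", "topic_b", "sadness", 1.0),
    ("2015-02", "topic_a", "joy", 0.5),
]
per_subtopic = aggregate_by_subtopic(units)
per_event = aggregate_by_event(per_subtopic)
```

Plotting the event-level vectors in time order would give one simple view of how the overall emotion changes, while the per-subtopic vectors keep the finer-grained picture.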
