Network public opinion prediction by empirical mode decomposition-autoregression based on extreme gradient boosting model

doi:10.11772/j.issn.1001-9081.2017071846

Journal of Computer Applications, 2018, Vol. 38, Issue (3): 615-619. DOI: 10.11772/j.issn.1001-9081.2017071846

• Artificial Intelligence •

MO Zan, ZHAO Bing, HUANG Yanying

School of Management, Guangdong University of Technology, Guangzhou, Guangdong 510520, China

Received: 2017-07-31 Revised: 2017-09-18 Online: 2018-03-10 Published: 2018-03-07
Corresponding author: ZHAO Bing
About the authors: MO Zan (born 1962), male, from Guangzhou, Guangdong, professor, Ph.D.; research interests: e-commerce and management information systems. ZHAO Bing (born 1993), female, from Zhoukou, Henan, master's student; research interests: machine learning and data mining. HUANG Yanying (born 1991), female, from Shaoguan, Guangdong, master's student; research interests: machine learning and data mining.
Supported by:
This work is partially supported by the National Natural Science Foundation of China (711710); the "Twelfth Five-Year" National Science and Technology Support Program major project (2011BAD13B11); and the Guangdong Provincial Regional Demonstration Project for Marine Economic Innovation and Development (GD2013-D01-001).

Abstract

Abstract: In the era of big data, network public opinion data is characterized by large information volume and broad domain coverage. Faced with such complex data, a traditional single model has limited predictive power and cannot forecast opinion trends effectively. To address this problem, an improved combination model based on Empirical Mode Decomposition-AutoRegression (EMD-AR) was proposed, called EMD-ARXG (Empirical Mode Decomposition-AutoRegression based on eXtreme Gradient boosting), and applied to the prediction of complex network public opinion trends. In this model, the Empirical Mode Decomposition (EMD) algorithm was used to decompose the time series, and an AutoRegression (AR) model was then fitted to the trend of each decomposed component to build a sub-model. Finally, the sub-models were recombined to complete the modelling process. In addition, to reduce the fitting error of the AR model, the residuals were learned by eXtreme Gradient Boosting (XGBoost), and each sub-model was iteratively updated to improve its prediction accuracy. To verify its prediction performance, the EMD-ARXG model was compared experimentally with a wavelet neural network model and an EMD-based back propagation neural network model. The experimental results show that the EMD-ARXG model outperforms both models on all three indicators: Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE) and Theil Inequality Coefficient (TIC).
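The decompose, fit, and recombine structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the moving-average split is only a stand-in for EMD, the AR coefficients are estimated from the Yule-Walker equations with the Levinson-Durbin recursion (an assumed estimation method), and the XGBoost residual-learning step is omitted entirely.

```python
def autocovariance(x, lag):
    """Biased sample autocovariance of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag)) / n


def fit_ar(x, order):
    """Estimate AR coefficients from the Yule-Walker equations using the
    Levinson-Durbin recursion. Returns (series mean, coefficient list)."""
    mean = sum(x) / len(x)
    r = [autocovariance(x, k) for k in range(order + 1)]
    if r[0] == 0.0:  # constant series: nothing to model beyond the mean
        return mean, [0.0] * order
    coeffs, err = [], r[0]
    for k in range(1, order + 1):
        acc = r[k] - sum(coeffs[j] * r[k - 1 - j] for j in range(k - 1))
        reflection = acc / err
        coeffs = [coeffs[j] - reflection * coeffs[k - 2 - j]
                  for j in range(k - 1)] + [reflection]
        err *= 1.0 - reflection ** 2
    return mean, coeffs


def forecast_ar(x, mean, coeffs, steps):
    """Iterated multi-step forecast from a fitted AR model."""
    history = [v - mean for v in x]
    out = []
    for _ in range(steps):
        nxt = sum(c * history[-1 - j] for j, c in enumerate(coeffs))
        history.append(nxt)
        out.append(nxt + mean)
    return out


def split_trend_remainder(x, window=5):
    """Stand-in for EMD (assumption): trailing moving-average trend plus
    remainder. The two components sum back to the original series."""
    trend = []
    for t in range(len(x)):
        lo = max(0, t - window + 1)
        trend.append(sum(x[lo:t + 1]) / (t + 1 - lo))
    remainder = [a - b for a, b in zip(x, trend)]
    return [remainder, trend]


def decompose_fit_forecast(x, order=2, steps=3):
    """Fit one AR sub-model per component and sum the component forecasts."""
    total = [0.0] * steps
    for component in split_trend_remainder(x):
        mean, coeffs = fit_ar(component, order)
        for i, v in enumerate(forecast_ar(component, mean, coeffs, steps)):
            total[i] += v
    return total


if __name__ == "__main__":
    series = [3.0, 5.0, 4.0, 6.0, 8.0, 7.0, 9.0, 11.0, 10.0, 12.0]
    print(decompose_fit_forecast(series))
```

In a fuller version, each sub-model's in-sample residuals would be passed to a boosted-tree regressor and the correction added to the AR forecast before the components are summed.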

Key words: trend fitting, network public opinion prediction, Empirical Mode Decomposition (EMD), AutoRegression (AR), eXtreme Gradient Boosting (XGBoost), residual learning
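The three evaluation indicators named in the abstract can be computed as in the sketch below. The abstract does not give the formulas, so these are the common textbook definitions and should be read as assumptions: MAPE is expressed in percent, and TIC uses Theil's bounded form, where 0 indicates a perfect fit and 1 the worst case.

```python
import math


def rmse(actual, predicted):
    """Root Mean Square Error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.
    Every actual value must be non-zero."""
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


def tic(actual, predicted):
    """Theil Inequality Coefficient (assumed bounded form):
    RMSE divided by the sum of the root mean squares of both series."""
    n = len(actual)
    rms_actual = math.sqrt(sum(a * a for a in actual) / n)
    rms_predicted = math.sqrt(sum(p * p for p in predicted) / n)
    return rmse(actual, predicted) / (rms_actual + rms_predicted)


if __name__ == "__main__":
    # Illustrative numbers only, not data from the paper.
    actual = [100.0, 120.0, 90.0, 110.0]
    predicted = [98.0, 125.0, 92.0, 108.0]
    print(rmse(actual, predicted), mape(actual, predicted), tic(actual, predicted))
```

Lower values are better for all three indicators, which is the sense in which one model is reported to outperform another.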


MO Zan, ZHAO Bing, HUANG Yanying. Network public opinion prediction by empirical mode decomposition-autoregression based on extreme gradient boosting model[J]. Journal of Computer Applications, 2018, 38(3): 615-619.

