Journal of Computer Applications ›› 2014, Vol. 34 ›› Issue (7): 1862-1866. DOI: 10.11772/j.issn.1001-9081.2014.07.1862

• Advanced computing •

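Multivariate linear regression forecasting model based on MapReduce

DAI Liang1,2, XU Hongke1,2, CHEN Ting2,3, QIAN Chao1,2, LIANG Dianpeng4

  1. School of Electronic and Control Engineering, Chang'an University, Xi'an, Shaanxi 710064, China;
    2. Shaanxi Road Traffic Detection and Equipment Engineering Research Center, Xi'an, Shaanxi 710064, China;
    3. School of Information Engineering, Chang'an University, Xi'an, Shaanxi 710064, China;
    4. IBM China Systems and Technology Laboratory, Xi'an, Shaanxi 710068, China
  • Received: 2014-01-17  Revised: 2014-03-16  Online: 2014-07-01  Published: 2014-08-01
  • Corresponding author: DAI Liang
  • About the authors: DAI Liang (born 1981), male, from Xi'an, Shaanxi, lecturer, Ph.D., CCF member; research interests: parallel computing, parallel processing of massive data. XU Hongke (born 1963), male, from Baoji, Shaanxi, professor, doctoral supervisor, Ph.D.; research interests: intelligent transportation systems. CHEN Ting (born 1982), female, from Xi'an, Shaanxi, lecturer, Ph.D.; research interests: distributed systems, parallel computing. QIAN Chao (born 1984), male, from Xuzhou, Jiangsu, lecturer, Ph.D.; research interests: traffic data mining and analysis. LIANG Dianpeng (born 1977), male, from Wuwei, Gansu, senior engineer, M.S.; research interests: parallel computing, load balancing.
  • Supported by:

    National Natural Science Foundation of China; Innovative Research Team Development Program of the Ministry of Education; Basic Research Program of the Ministry of Transport; Natural Science Basic Research Program of Shaanxi Province; Fundamental Research Funds for the Central Universities; China Postdoctoral Science Foundation (General Program)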

Abstract:

Traditional multivariate linear regression forecasting methods have long processing times and are limited by memory capacity. To address this, a parallel multivariate linear regression forecasting model based on MapReduce was designed for time-series sample data. The model consists of three sets of MapReduce processes, which respectively solve for the eigenvectors and orthonormal eigenvectors of the cross-product matrix formed from historical data, forecast the eigenvalue and eigenvector matrices used to predict the future parameters, and estimate the regression parameters at the next moment. Experiments were designed and carried out to verify the effectiveness of the proposed model. The experimental results show that the MapReduce-based multivariate linear regression forecasting model achieves good speedup and scaleup, and is suitable for the analysis and forecasting of large-scale time-series data.
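The abstract does not give the model's equations, so the following is only a minimal sketch of the general idea that makes linear regression fit the MapReduce pattern: the cross-product matrices X^T X and X^T y are sums over samples, so each input split can compute a partial sum (map) and the partial sums can be added together (reduce). It is not the authors' three-stage, eigenvector-based procedure. As a simplifying assumption it solves the normal equations directly, and plain Python stands in for a Hadoop cluster. The names `map_partial`, `reduce_sum`, `solve`, and `fit` are hypothetical.

```python
from functools import reduce


def map_partial(chunk):
    """Map step: partial cross-products (X^T X, X^T y) for one input split."""
    p = len(chunk[0][0]) + 1  # +1 for the intercept column
    xtx = [[0.0] * p for _ in range(p)]
    xty = [0.0] * p
    for features, target in chunk:
        row = [1.0] + list(features)
        for i in range(p):
            xty[i] += row[i] * target
            for j in range(p):
                xtx[i][j] += row[i] * row[j]
    return xtx, xty


def reduce_sum(a, b):
    """Reduce step: element-wise sum of two partial results."""
    xtx = [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a[0], b[0])]
    xty = [u + v for u, v in zip(a[1], b[1])]
    return xtx, xty


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit(chunks):
    """Combine per-split cross-products, then solve (X^T X) beta = X^T y."""
    xtx, xty = reduce(reduce_sum, map(map_partial, chunks))
    return solve(xtx, xty)


# Toy data generated from y = 1 + 2*x1 + 3*x2, split into two "input splits".
chunks = [
    [((0.0, 0.0), 1.0), ((1.0, 0.0), 3.0)],
    [((0.0, 1.0), 4.0), ((1.0, 1.0), 6.0), ((2.0, 1.0), 8.0)],
]
beta = fit(chunks)  # [intercept, coefficient of x1, coefficient of x2]
```

Because the toy data are exactly linear, `beta` recovers the generating coefficients up to floating-point error. The size of each partial result depends only on the number of predictors, not on the number of samples, which is why this decomposition avoids the memory limit of holding all samples on one machine.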

CLC number: