Journal of Computer Applications (计算机应用), 2017, Vol. 37, Issue (2): 587-592. DOI: 10.11772/j.issn.1001-9081.2017.02.0587

• Computer software technology •

基于稳态过程的多重分形Web日志仿真生成算法

彭行雄1,2, 肖如良1,2   

  1. 福建师范大学 软件学院, 福州 350117;
  2. 福建省公共服务大数据挖掘与应用工程技术研究中心, 福州 350117
  • 收稿日期:2016-06-14 修回日期:2016-08-18 出版日期:2017-02-10 发布日期:2017-02-11
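  • Corresponding author: XIAO Ruliang, xiaoruliang@163.com
  • About the authors: PENG Xingxiong (born 1991), male, from Xiaogan, Hubei, M.S. candidate; main research interest: machine learning. XIAO Ruliang (born 1966), male, from Loudi, Hunan, professor, Ph.D., CCF senior member; main research interests: software engineering and new software technologies for big data.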
  • 基金资助:
    福建省高校产学合作项目(2016H6007)。

Multi-fractal Web log simulation generation algorithm based on stable process

PENG Xingxiong1,2, XIAO Ruliang1,2   

  1. Faculty of Software, Fujian Normal University, Fuzhou, Fujian 350117, China;
  2. Fujian Provincial Engineering Research Center of Public Service Big Data Analysis and Application, Fuzhou, Fujian 350117, China
  • Received:2016-06-14 Revised:2016-08-18 Online:2017-02-10 Published:2017-02-11
  • Supported by:
    This work is partially supported by the University-Industry Cooperation Project of Fujian Province (2016H6007).

摘要: 运行在服务器集群的软件系统需要Web日志的大规模数据集以满足性能测试的需求,但现有仿真生成算法因模型单一而无法满足要求。针对此问题,提出一种基于alpha稳态过程的多分形Web日志的仿真生成算法。首先,在长相关尺度(LRD)下采用alpha稳态过程来描述Web日志的自相似性;其次,在短相关尺度(SRD)下采用二项式b模型描述Web日志的多重分形性;最后,将长相关模型和短相关模型融合于改进的ON/OFF框架中。与单一的模型相比,新算法的参数物理意义明确,具有良好的自相似性和多分形性。实验结果表明,该算法能够较准确地模拟真实Web日志,可以有效地应用于Web日志大规模数据集的仿真生成。

关键词: 稳态过程, 多重分形, 自相似, 时间序列, 日志分析, 仿真生成

Abstract: Software systems running on server clusters need large-scale Web log data sets for performance testing, but existing simulation generation algorithms cannot meet this need because each relies on a single model. To address this problem, a multi-fractal Web log simulation generation algorithm based on the alpha-stable process is proposed. First, the self-similarity of Web logs at the Long Range Dependence (LRD) scale is described by an alpha-stable process. Second, the multi-fractality of Web logs at the Short Range Dependence (SRD) scale is described by the binomial b-model. Finally, the LRD model and the SRD model are integrated into an improved ON/OFF framework. Compared with single-model approaches, the parameters of the proposed algorithm have clear physical meaning, and the algorithm exhibits good self-similarity and multi-fractality. Experimental results show that the proposed algorithm simulates real Web logs fairly accurately and can be effectively applied to generating large-scale simulated Web log data sets.

Key words: stable process, multi-fractal, self-similarity, time series, log analysis, simulation generation
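
To make the short-range step in the abstract concrete, the following is a minimal sketch of a binomial multiplicative cascade in the spirit of the b-model. It is not the authors' implementation. The function name `b_model_cascade`, the bias value `b=0.7`, and the request total are illustrative assumptions, and the alpha-stable LRD component and the ON/OFF framework are not covered.

```python
import random


def b_model_cascade(total, levels, b=0.7, rng=None):
    """Spread `total` requests over 2**levels time slots (hypothetical helper).

    At every level each interval is halved; one half receives a fraction b
    of the interval's mass and the other half receives 1 - b. Which half
    gets the larger share is chosen at random.
    """
    rng = rng or random.Random()
    masses = [float(total)]
    for _ in range(levels):
        next_masses = []
        for mass in masses:
            left_share = b if rng.random() < 0.5 else 1.0 - b
            next_masses.append(mass * left_share)
            next_masses.append(mass * (1.0 - left_share))
        masses = next_masses
    return masses


# Illustrative values only: 10,000 requests over 2**4 = 16 slots.
slots = b_model_cascade(total=10_000, levels=4, b=0.7, rng=random.Random(0))
```

The cascade conserves the total request count while concentrating it unevenly across slots; a bias closer to 0.5 gives smoother traffic, and a bias closer to 1 gives burstier traffic.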

CLC number: