计算机应用 ›› 2009, Vol. 29 ›› Issue (08): 2253-2253.

• 人工智能 • 上一篇    下一篇

由贪心策略构造Chebyshev多项式概要

李方圆1,何海江2   

  1. 1. 长沙大学
    2. 长沙学院
  • 收稿日期:2009-02-12 修回日期:2009-03-25 发布日期:2008-08-01 出版日期:2009-08-01
  • 通讯作者: 李方圆
  • 基金资助:
    60873016;国家级基金

Constructing Chebyshev polynomial synopses with greedy strategy

  • Received:2009-02-12 Revised:2009-03-25 Online:2008-08-01 Published:2009-08-01

摘要: 基于Chebyshev多项式的概要能有效估计数据库关系属性的频度分布。然而,从M个Chebyshev系数选择最近似原始频度分布的N(N>M)个系数,是NP难问题。依据贪心策略,提出了三种概要构造算法,精度最高的一个称为GreedyB。 GreedyB先找出2N个绝对值最大的系数,再由贪心策略剔除多余的N个。在模拟数据序列和实际数据序列的实验数据表明,GreedyB尽管时间复杂度要高,但L1、L2、L∞等误差显著较小。

关键词: 贪心策略, Chebyshev多项式, 概要, 数据库关系属性, 频度分布, greedy strategy, Chebyshev polynomial, synopsis, database relation attribute, frequency distribution

Abstract: The synopses based Chebyshev polynomials can be efficiently used to estimate frequency distribution of attributes of a given database relation. However, it is a NPhard problem to choose N coefficients most approximate to the origin frequency distribution from M (M>N)Chebyshev polynomial coefficients. Three greedy algorithms were introduced, and the best one was named GreedyB. First, 2N largest absolute value coefficients were picked out, then, redundant N coefficients were eliminated by greedy strategy. Experimental results over synthetic and real data sequences clearly show that compared with the existing algorithms, despite GreedyB has higher time complexity, it achieves higher accuracy as measured by the L1,L2 and L_inf error metrics.

中图分类号: