Journal of Computer Applications ›› 2009, Vol. 29 ›› Issue (12): 3300-3302.

• Database and Data Mining •

Fast cell-based outlier mining algorithm

崔贯勋1,李梁2,王勇2,倪伟2,黄丽丰2   

  1. Chongqing University of Technology
    2.
  • Received: 2009-06-22 Revised: 2009-08-18 Online: 2009-12-10 Published: 2009-12-01
  • Corresponding author: 崔贯勋
  • Supported by:
    Chongqing Science and Technology Key Research Program

Fast outliers mining algorithm based on unit cell

  • Received: 2009-06-22 Revised: 2009-08-18 Online: 2009-12-10 Published: 2009-12-01

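Abstract: To address the slow speed of mining outliers from a data set, a fast cell-based outlier mining algorithm is proposed. Exploiting the properties of a grid, the algorithm first partitions the data into a number of spatial cells, which reduces the number of region queries and speeds up outlier mining. It then uses per-cell thresholds to decide whether a data point is an outlier. Tests on data show that the algorithm mines the outliers in a data set quickly and effectively.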

Keywords: data mining, outlier data, unit cell, neighbor cell

Abstract: Mining outliers from a dataset is slow. Exploiting the characteristics of a grid, a fast outlier mining algorithm is proposed that first partitions the data into a set of unit cells. This reduces the number of region queries and therefore increases mining speed. Whether a data point is an outlier is then decided according to specified thresholds. As a result, the algorithm can handle large numbers of points. Experimental results show that the algorithm is fast and effective.

Key words: data mining, outlier, unit cell, neighbour unit
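
The abstract does not state the cell size, the neighbour definition, or the exact threshold rule, so the following Python sketch is only an illustration of the general cell-based idea and not the authors' algorithm. It assumes a uniform grid with a hypothetical cell side `side`, treats the 3^d cells surrounding a cell as its neighbour units, and flags every point in a cell whose neighbourhood holds at most a hypothetical `threshold` points. All function and parameter names are invented for this example.

```python
from collections import defaultdict
from itertools import product
from math import floor


def cell_of(point, side):
    """Map a point to the integer coordinates of the grid cell containing it."""
    return tuple(floor(x / side) for x in point)


def neighbour_cells(cell):
    """Yield the cell itself and every adjacent cell (3**d cells in d dimensions)."""
    for offset in product((-1, 0, 1), repeat=len(cell)):
        yield tuple(c + o for c, o in zip(cell, offset))


def cell_outliers(points, side, threshold):
    """Flag points whose cell neighbourhood holds at most `threshold` points.

    Assumption: this counting rule stands in for the paper's unspecified
    per-cell threshold test.
    """
    cells = defaultdict(list)
    for p in points:
        cells[cell_of(p, side)].append(p)

    outliers = []
    for cell, members in cells.items():
        # One count per occupied cell replaces a region query per point.
        nearby = sum(len(cells.get(n, ())) for n in neighbour_cells(cell))
        if nearby <= threshold:
            outliers.extend(members)
    return outliers


points = [(0.1, 0.2), (0.3, 0.1), (0.2, 0.4), (0.4, 0.3), (5.0, 5.0)]
print(cell_outliers(points, side=1.0, threshold=1))  # → [(5.0, 5.0)]
```

Because the decision is made once per occupied cell, the cost of the check depends on the number of occupied cells and their neighbours, not on a separate neighbourhood search for every point, which is the kind of saving the abstract attributes to reducing region queries.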