Journal of Computer Applications (计算机应用), 2010, Vol. 30, Issue (05): 1284-1286.

• Data Mining and Artificial Intelligence •

Relative outlier detection algorithm based on information entropy for dynamic data environments

Sun Hao (孙浩)1, He Xiaohong (何晓红)2

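  1. Chongqing University of Posts and Telecommunications
  2. School of Bioinformatics, Chongqing University of Posts and Telecommunications
  • Received: 2009-12-08 Revised: 2010-01-11 Online: 2010-05-04 Published: 2010-05-01
  • Corresponding author: Sun Hao (孙浩)
  • Supported by:
    Natural Science Foundation of Chongqing University of Posts and Telecommunications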

Entropy-based algorithm to detect relative outliers in dynamic environment


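Abstract: Building on an existing outlier detection algorithm based on information entropy, this paper proposes a detection algorithm suited to dynamic data environments. When a data object is inserted or deleted, the algorithm does not have to compute the relative outlier factor (ROF) of every data object; it only recomputes the ROF of the points affected by the change. Experimental results show that, in a dynamic data environment, the algorithm takes less running time than the original algorithm.

Keywords: dynamic data environment, information entropy, outlier detection, local outlier factor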

Abstract: An algorithm based on information entropy is proposed for detecting relative outliers in a dynamic data environment. When an object is inserted into or deleted from the dataset, the algorithm does not need to recompute the Relative Outlier Factor (ROF) of every object in the dataset; it recomputes the ROF only for the objects affected by the change. Experimental results indicate that, in a dynamic environment, the running time of this algorithm is shorter than that of the original algorithm.

Key words: dynamic environment, entropy, outlier detection, Local Outlier Factor (LOF)
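
The incremental idea described in the abstract (recompute scores only for the objects affected by an insertion or deletion) can be illustrated with a minimal Python sketch. The abstract does not give the definition of ROF or of the entropy term, so this sketch swaps in a much simpler stand-in score, the mean distance to the k nearest neighbours. The names `knn_score` and `IncrementalScores` are hypothetical and are not taken from the paper. Under this stand-in score, a point is affected by an insertion only if the new object is closer than its current k-th neighbour, and by a deletion only if the removed object was among its k nearest neighbours.

```python
import math


def knn_score(idx, points, k):
    """Stand-in score (NOT the paper's ROF): mean distance from points[idx]
    to its k nearest neighbours. Also returns the k-th neighbour distance."""
    dists = sorted(math.dist(points[idx], p)
                   for j, p in enumerate(points) if j != idx)
    return sum(dists[:k]) / k, dists[k - 1]


class IncrementalScores:
    """Hypothetical helper: keeps per-point scores up to date while
    recomputing only the points affected by each change.
    Assumes the dataset always holds more than k points."""

    def __init__(self, points, k):
        self.points = [tuple(p) for p in points]
        self.k = k
        # One (score, k-th neighbour distance) pair per point.
        self.info = [knn_score(i, self.points, k)
                     for i in range(len(self.points))]

    def insert(self, q):
        q = tuple(q)
        # Affected: existing points for which q enters the k nearest neighbours.
        affected = [i for i, p in enumerate(self.points)
                    if math.dist(p, q) < self.info[i][1]]
        self.points.append(q)
        self.info.append(knn_score(len(self.points) - 1, self.points, self.k))
        for i in affected:
            self.info[i] = knn_score(i, self.points, self.k)
        return affected

    def delete(self, idx):
        removed = self.points.pop(idx)
        self.info.pop(idx)
        # Affected: remaining points that may have had `removed` as a neighbour.
        affected = [i for i, p in enumerate(self.points)
                    if math.dist(p, removed) <= self.info[i][1]]
        for i in affected:
            self.info[i] = knn_score(i, self.points, self.k)
        return affected

    def scores(self):
        return [score for score, _ in self.info]


pts = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10), (11, 10)]
inc = IncrementalScores(pts, k=2)
touched = inc.insert((10, 11))  # only the two points near (10, 10) are recomputed
```

In this toy dataset the four points of the unit square keep their cached scores after the insertion, which is the kind of saving the abstract attributes to the dynamic algorithm. A real ROF implementation would also have to propagate changes to points whose score depends on their neighbours' scores, which this sketch leaves out.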