Journal of Computer Applications ›› 2011, Vol. 31 ›› Issue (10): 2778-2781. DOI: 10.3724/SP.J.1087.2011.02778

• Artificial Intelligence •

Outlier detection algorithm based on global nearest neighbors

HU Yun1,2, SHI Jun1, WANG Chong-jun2, LI Hui1

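  1. School of Computer Engineering, Huaihai Institute of Technology, Lianyungang, Jiangsu 222000, China
  2. Department of Computer Science and Technology, Nanjing University, Nanjing 210000, China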
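  • Received: 2011-05-06 Revised: 2011-06-12 Online: 2011-10-11 Published: 2011-10-01
  • Corresponding author: LI Hui
  • About the authors: HU Yun (1978-), female, born in Lianyungang, Jiangsu, lecturer, Ph.D. candidate, CCF member; research interests: intelligent information processing, data mining. SHI Jun (1963-), female, born in Tongcheng, Anhui, associate professor; research interests: data processing, data mining. WANG Chong-jun (1975-), male, born in Xuyi, Jiangsu, associate professor; research interests: distributed artificial intelligence, intelligent information processing. LI Hui (1979-), female, born in Lianyungang, Jiangsu, lecturer, Ph.D. candidate; research interests: data mining, information retrieval.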
  • Supported by:

    Natural Science Foundation of Jiangsu Province (BK2008190)

Outlier detection algorithm based on global nearest neighborhood

HU Yun1,2, SHI Jun1, WANG Chong-jun2, LI Hui1   

  1. School of Computer Engineering, Huaihai Institute of Technology, Lianyungang, Jiangsu 222000, China
  2. Department of Computer Science and Technology, Nanjing University, Nanjing, Jiangsu 210000, China
  • Received: 2011-05-06 Revised: 2011-06-12 Online: 2011-10-11 Published: 2011-10-01

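Abstract: To address the efficiency problem of global nearest-neighbor outlier detection algorithms, and to detect outliers in a dataset quickly and accurately, attribute reduction is used to confine the outlier search to a smaller, most representative attribute subspace, which effectively lowers the complexity of searching the attribute space. On this basis, outliers are detected by computing a neighbor-based weighted outlier factor, and a corresponding algorithm is proposed. Experiments show that the algorithm has good adaptability and effectiveness.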

Keywords: outlier detection, nearest neighbor, attribute reduction

Abstract: Traditional outlier detection algorithms are inefficient because of their global nearest-neighbor search mechanism. This paper proposes an outlier detection method that uses attribute reduction to restrict the search to the most meaningful attributes of the data space. On the reduced attribute set, a neighborhood-based outlier factor is defined to judge how abnormal each data object is. The combined strategy significantly reduces search complexity and finds more reasonable outliers in the dataset. Experimental results also demonstrate the promising adaptability and effectiveness of the proposed approach.

Key words: outlier detection, nearest neighborhood, attribute reduction
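
The two-stage idea in the abstract (reduce the attributes, then score each object by a neighbor-based weighted factor) can be illustrated with a minimal Python sketch. The abstract does not give the paper's reduction criterion or its outlier-factor formula, so everything here is an assumption for illustration only: highest-variance column selection stands in for attribute reduction, and a rank-weighted mean distance to the k nearest neighbors stands in for the weighted outlier factor. The function names `select_attributes`, `weighted_outlier_factor`, and `detect_outliers` are hypothetical.

```python
import math
from statistics import pvariance


def select_attributes(rows, keep):
    """Stand-in for attribute reduction (assumption, not the paper's method):
    keep the `keep` columns with the highest population variance."""
    n_cols = len(rows[0])
    variances = [pvariance([r[j] for r in rows]) for j in range(n_cols)]
    ranked = sorted(range(n_cols), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:keep])


def project(rows, cols):
    """Restrict every row to the selected columns."""
    return [[r[j] for j in cols] for r in rows]


def weighted_outlier_factor(rows, k):
    """Hypothetical score: weighted mean Euclidean distance to the k nearest
    neighbors, where the neighbor of rank r (0-based) gets weight 1 / (r + 1)."""
    weights = [1.0 / (rank + 1) for rank in range(k)]
    total = sum(weights)
    scores = []
    for i, p in enumerate(rows):
        # Distances from p to every other object, nearest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(rows) if j != i)
        scores.append(sum(w * d for w, d in zip(weights, dists[:k])) / total)
    return scores


def detect_outliers(rows, keep, k, top_n):
    """Score objects in the reduced attribute subspace and return the indices
    of the top_n highest-scoring objects."""
    cols = select_attributes(rows, keep)
    scores = weighted_outlier_factor(project(rows, cols), k)
    ranked = sorted(range(len(rows)), key=lambda i: scores[i], reverse=True)
    return ranked[:top_n]


# Toy data: the third attribute is constant and carries no information;
# the last object lies far from the other four.
data = [
    [1.0, 1.1, 5.0],
    [1.2, 0.9, 5.0],
    [0.9, 1.0, 5.0],
    [1.1, 1.2, 5.0],
    [8.0, 9.0, 5.0],
]
```

On this toy data, `select_attributes(data, 2)` drops the constant third column, and `detect_outliers(data, keep=2, k=2, top_n=1)` flags the last object. Searching the reduced subspace shortens each distance computation, which is the kind of saving the abstract attributes to attribute reduction.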

CLC Number: