Journal of Computer Applications ›› 2009, Vol. 29 ›› Issue (09): 2502-2504.

• Database and Knowledge Engineering •

Missing-value imputation algorithm based on Mahalanobis distance and gray analysis

刘星毅   

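  1. Qinzhou University (钦州学院)
  • Received: 2009-03-23 Revised: 2009-05-12 Online: 2009-11-10 Published: 2009-09-01
  • Corresponding author: 刘星毅 (LIU Xing-Yi)
  • Funding:
    Natural Science Foundation of Guangxi (桂科自0899018); Scientific Research Project of the Guangxi Department of Education (200808MS062); others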

Improved kNN algorithm based on Mahalanobis distance and gray analysis

刘星毅 LIU Xing-Yi   


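Abstract: The Euclidean distance used in the kNN algorithm is sensitive to density correlation in the data. To address this shortcoming, a new algorithm is proposed that replaces the Euclidean distance in kNN with a combination of Mahalanobis distance and gray analysis, and it is applied to missing-data imputation. Mahalanobis distance handles datasets in which density correlation is pronounced, while gray analysis handles the case in which it is not. The algorithm can therefore handle datasets of either kind, and the experimental results show that its imputation results are clearly better than those of other existing algorithms.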
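The abstract does not give the details of the method, so the following is only a minimal sketch of one ingredient it names: kNN imputation in which neighbors are ranked by Mahalanobis distance instead of Euclidean distance. The function names (`covariance`, `solve`, `mahalanobis`, `impute_knn_mahalanobis`), the use of the sample covariance of the complete rows, and the choice of filling a numeric value with the mean over the k nearest neighbors are all assumptions made for illustration, not the author's implementation.

```python
import math

def covariance(rows):
    """Sample covariance matrix of a list of equal-length numeric rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[sum((r[a] - means[a]) * (r[b] - means[b]) for r in rows) / (n - 1)
             for b in range(d)] for a in range(d)]

def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def mahalanobis(u, v, cov):
    """sqrt((u - v)^T cov^-1 (u - v)), computed without forming the inverse."""
    diff = [a - b for a, b in zip(u, v)]
    weighted = solve(cov, diff)
    return math.sqrt(max(0.0, sum(d * w for d, w in zip(diff, weighted))))

def impute_knn_mahalanobis(complete_rows, query, target, k=3):
    """Estimate query[target] as the mean target value of the k complete rows
    nearest to the query, measured on the remaining columns (assumed design)."""
    observed = [j for j in range(len(query)) if j != target]
    features = [[r[j] for j in observed] for r in complete_rows]
    cov = covariance(features)
    q = [query[j] for j in observed]
    ranked = sorted(range(len(complete_rows)),
                    key=lambda i: mahalanobis(features[i], q, cov))
    neighbors = ranked[:k]
    return sum(complete_rows[i][target] for i in neighbors) / len(neighbors)
```

Because the covariance matrix enters the distance, two points that differ along the direction in which the attributes vary together count as closer than two points that differ across it, which plain Euclidean distance cannot express.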

Keywords: data preprocessing, missing data, nearest neighbor algorithm, gray analysis, Mahalanobis distance

Abstract: The Euclidean-distance-based k-Nearest Neighbor (kNN) algorithm is sensitive to density correlation, which restricts it to datasets without such correlation. The author proposes an improved kNN algorithm for imputing missing data that replaces the Euclidean distance with a combination of Mahalanobis distance and gray analysis. The Mahalanobis distance handles datasets in which density correlation is pronounced, and the gray-analysis method handles the opposite case. Hence, the proposed method can deal with datasets of either kind, and the experimental results show that it outperforms the existing algorithms.
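As an illustration of the gray-analysis side, the sketch below ranks candidate rows by the standard gray relational grade (the mean of Deng's gray relational coefficients) and imputes from the k highest-graded rows. The abstract does not state which gray-analysis formulation the paper uses or how it is combined with the Mahalanobis distance, so the function names (`gray_relational_grades`, `impute_knn_gray`), the distinguishing coefficient `rho=0.5`, the absence of attribute normalization, and the mean-based fill are assumptions for illustration only.

```python
def gray_relational_grades(reference, candidates, rho=0.5):
    """Gray relational grade of each candidate row against the reference row.

    Coefficient per attribute: (d_min + rho * d_max) / (d + rho * d_max),
    where d is the absolute difference on that attribute and d_min, d_max
    are taken over all candidates and attributes. Higher means more similar.
    """
    deltas = [[abs(r - c) for r, c in zip(reference, cand)] for cand in candidates]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:
        # Every candidate equals the reference on every attribute.
        return [1.0] * len(candidates)
    scale = rho * d_max
    return [sum((d_min + scale) / (d + scale) for d in row) / len(row)
            for row in deltas]

def impute_knn_gray(complete_rows, query, target, k=3):
    """Estimate query[target] as the mean target value of the k complete rows
    with the highest gray relational grade on the remaining columns."""
    observed = [j for j in range(len(query)) if j != target]
    q = [query[j] for j in observed]
    features = [[r[j] for j in observed] for r in complete_rows]
    grades = gray_relational_grades(q, features)
    ranked = sorted(range(len(complete_rows)), key=lambda i: grades[i], reverse=True)
    neighbors = ranked[:k]
    return sum(complete_rows[i][target] for i in neighbors) / len(neighbors)
```

Unlike a distance, the grade is a similarity in (0, 1], so neighbors are taken from the top of the ranking; in practice the attributes would usually be rescaled to comparable ranges first, a step omitted here for brevity.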

Key words: data preprocessing, missing data, Nearest Neighbor (NN) algorithm, gray analysis, Mahalanobis distance
