Journal of Computer Applications, 2014, Vol. 34, Issue 2: 542-545.

• Artificial Intelligence •

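Concept drift detection based on distance measurement of overlapped data windows

LIU Mao, ZHANG Dongbo, ZHAO Yuanyuan

  1. College of Information Engineering, Xiangtan University, Xiangtan, Hunan 411105, China
  • Received: 2013-07-17 Revised: 2013-09-06 Online: 2014-02-01 Published: 2014-03-01
  • Contact: LIU Mao
  • About the authors: LIU Mao (born 1987), male, from Yongzhou, Hunan, master's student; research interests: concept drift and ensemble learning. ZHANG Dongbo (born 1973), from Longhui, Hunan, professor, Ph.D.; research interests: pattern recognition, image processing, ensemble learning, and intelligent information processing. ZHAO Yuanyuan (born 1988), female, from Shaoyang, Hunan, master's student; research interests: pattern recognition and image processing.
  • Supported by:
    a project funded by the National Natural Science Foundation of China; a research project funded by the Education Department of Hunan Province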

Abstract: Concept drift detection in data streams suffers from false detections and detection delays. To address these problems, an online concept drift detection method based on distance measurement of overlapped data windows is proposed. The data stream is divided into equal-sized, overlapping data windows, and the heterogeneous Euclidean distance between adjacent overlapping windows is computed; the nearest-neighbor principle is then used to judge the degree of inconsistency among the samples in the windows, which allows differences in distribution to be evaluated and drift to be detected. To evaluate the effectiveness of the method, experiments were conducted on public data sets with different drift severities and drift speeds. The results show that the method detects different types of concept drift quickly and accurately and can identify the specific position at which a drift occurs.

Key words: concept drift, data stream, heterogeneous Euclidean distance, overlapped data windows
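
The abstract does not give the formulas used, so the following Python sketch only illustrates the general idea under stated assumptions. It assumes that the heterogeneous Euclidean distance takes the common Euclidean-overlap form (range-normalized difference for numeric attributes, 0/1 mismatch for nominal ones), and that the inconsistency of two adjacent windows is the fraction of samples in the newer window whose nearest neighbor in the older window carries a different label. The function names, the window size, the step, and the threshold are all hypothetical and are not taken from the paper.

```python
import math


def overlapped_windows(stream, size, step):
    """Yield (start, window) pairs of equal-sized windows.

    Consecutive windows overlap whenever step < size.
    """
    for start in range(0, len(stream) - size + 1, step):
        yield start, stream[start:start + size]


def heom(a, b, ranges):
    """Heterogeneous Euclidean-overlap distance between two samples (assumed form).

    ranges[i] is the value range of numeric attribute i,
    or None when attribute i is nominal.
    """
    total = 0.0
    for x, y, r in zip(a, b, ranges):
        if r is None:
            d = 0.0 if x == y else 1.0          # nominal: overlap metric
        else:
            d = abs(x - y) / r if r > 0 else 0.0  # numeric: range-normalized
        total += d * d
    return math.sqrt(total)


def inconsistency(old, new, ranges):
    """Fraction of samples in `new` whose nearest neighbor in `old`
    has a different label (an assumed reading of the nearest-neighbor rule)."""
    mismatches = 0
    for features, label in new:
        nearest = min(old, key=lambda s: heom(features, s[0], ranges))
        if nearest[1] != label:
            mismatches += 1
    return mismatches / len(new)


def detect_drift(stream, ranges, size=20, step=10, threshold=0.4):
    """Return start indices of windows that look inconsistent with the
    preceding overlapped window. `stream` is a list of (features, label)."""
    windows = list(overlapped_windows(stream, size, step))
    hits = []
    for (_, old), (start, new) in zip(windows, windows[1:]):
        if inconsistency(old, new, ranges) > threshold:
            hits.append(start)
    return hits
```

In this sketch the samples shared by two adjacent windows always agree with themselves, so only the newly arrived samples can raise the inconsistency score; a smaller step therefore narrows down the reported position at the cost of a lower maximum score per window, and the threshold has to be chosen with that in mind.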
