Journal of Computer Applications, 2012, Vol. 32, Issue (11): 3034-3037. DOI: 10.3724/SP.J.1087.2012.03034

• Artificial Intelligence •

Instance selection algorithms of balanced class distribution based on Hubness for time series

ZHAI Ting-ting, HE Zhen-feng

  1. School of Mathematics and Computer Science, Fuzhou University, Fuzhou Fujian 350108, China
  • Received: 2012-05-31 Revised: 2012-07-06 Online: 2012-11-12 Published: 2012-11-01
  • Contact: ZHAI Ting-ting
  • About the authors: ZHAI Ting-ting (born 1988), female, from Jiyuan, Henan, master's student; main research interest: data mining. HE Zhen-feng (born 1971), male, from Shitai, Anhui, associate professor, Ph.D.; main research interests: machine learning and compiler optimization.

Abstract: The instance selection algorithm INSIGHT has two problems: the class distribution of the instances it selects is imbalanced, and instances with equal scores cannot be distinguished by importance. Two improved algorithms are proposed, one for each problem. The first, BINSIGHT1, follows a divide-and-conquer strategy: it picks the most representative instances from each class of the training set, so that the class distribution of the selected instances is as balanced as possible. The second, BINSIGHT2, replaces the single sorting of BINSIGHT1 with double sorting so that the importance of instances can be measured more effectively. The experimental results show that both proposed algorithms achieve higher classification accuracy than INSIGHT while the time complexity remains essentially unchanged.

Key words: instance selection, Hubness, class balance, time series, classification
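
The abstract gives no pseudocode, so the following is only a minimal sketch of the two ideas it describes (per-class selection, then a two-key ranking), not the authors' implementation. Several things are assumptions made for illustration: the importance score is taken to be the "good 1-occurrence" count (how often an instance is the nearest neighbour of another instance with the same label), plain Euclidean distance stands in for a time-series distance such as DTW, every class gets the same quota `per_class`, and the secondary sort key (fewer "bad" occurrences) is hypothetical, since the abstract does not say what the second key of BINSIGHT2 is. The function names `occurrence_counts` and `select_balanced` are likewise invented here.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length series (assumed stand-in for DTW)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def occurrence_counts(series, labels, dist=euclidean):
    """Return (good, bad) lists of 1-nearest-neighbour occurrence counts.

    good[j] counts instances whose nearest neighbour is j and share j's label;
    bad[j] counts instances whose nearest neighbour is j but have another label.
    """
    n = len(series)
    good = [0] * n
    bad = [0] * n
    for i in range(n):
        # Nearest neighbour of instance i among all other instances.
        nn = min((j for j in range(n) if j != i),
                 key=lambda j: dist(series[i], series[j]))
        if labels[nn] == labels[i]:
            good[nn] += 1
        else:
            bad[nn] += 1
    return good, bad


def select_balanced(series, labels, per_class, double_sort=False):
    """Pick the `per_class` top-ranked instances from every class.

    Single sort: rank by good count only (ties broken by index).
    Double sort: rank by good count, then by fewer bad occurrences
    (hypothetical second key, chosen only for illustration).
    """
    good, bad = occurrence_counts(series, labels)

    # Divide: group instance indices by class label.
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    selected = []
    for y in sorted(by_class):
        if double_sort:
            ranked = sorted(by_class[y], key=lambda i: (-good[i], bad[i], i))
        else:
            ranked = sorted(by_class[y], key=lambda i: (-good[i], i))
        # Conquer: keep the same quota from each class.
        selected.extend(ranked[:per_class])
    return selected


if __name__ == "__main__":
    # Toy one-point "series"; real time series would be longer lists.
    series = [[6.0], [5.0], [0.0], [1.0], [6.9], [20.0], [21.0]]
    labels = ["A", "A", "A", "A", "B", "B", "B"]
    print(select_balanced(series, labels, per_class=2))
    print(select_balanced(series, labels, per_class=2, double_sort=True))
```

In this toy data three class-A instances tie on the first key, so the single-sort variant falls back to index order, while the two-key variant prefers the tied instances that are never a misleading nearest neighbour. Selecting within each class separately is what keeps the class distribution of the result balanced regardless of how the scores are spread across classes.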
