Journal of Computer Applications ›› 2016, Vol. 36 ›› Issue (1): 188-193. DOI: 10.11772/j.issn.1001-9081.2016.01.0188

Data discretization algorithm based on adaptive improved particle swarm optimization

DONG Yuehua, LIU Li   

  1. College of Information Engineering, Jiangxi University of Science and Technology, Ganzhou Jiangxi 341000, China
  • Received: 2015-07-10 Revised: 2015-08-26 Online: 2016-01-10 Published: 2016-01-09

基于自适应改进粒子群优化的数据离散化算法

董跃华, 刘力   

  1. 江西理工大学 信息工程学院, 江西 赣州 341000
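  • Corresponding author: LIU Li (born 1990), male, from Huanggang, Hubei; master's student; main research interest: data mining.
  • About the authors: DONG Yuehua (born 1964), female, from Laoting, Hebei; associate professor, M.S.; main research interests: data mining and software engineering.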

Abstract: To address the limitation that classical rough set theory can handle only discrete attributes, a discretization algorithm based on Adaptive Hybrid Particle Swarm Optimization (AHPSO) was proposed. First, an adaptive adjustment strategy was introduced to overcome the tendency of the particle swarm to become trapped in local optima and to improve its global search ability. Second, Tabu Search (TS) was applied to the global best particle of each generation to obtain the best global optimum of that generation, which enhanced the local search ability of the swarm. Finally, with the classification ability of the decision table kept unchanged, the candidate cut points for attribute discretization were used to initialize the particle swarm, and the optimal cut points were obtained through the interaction between particles. Using the J48 decision tree classifier on the WEKA (Waikato Environment for Knowledge Analysis) platform, the classification accuracy of the proposed algorithm was about 10% to 20% higher than that of discretization algorithms based on attribute importance and information entropy, and about 2% to 5% higher than that of discretization algorithms based on Niche Discrete PSO (NDPSO) and linearly decreasing weight PSO. The experimental results show that the proposed algorithm significantly improves the classification accuracy of the J48 decision tree and is effective for discretizing continuous attributes.

Key words: classical rough set, self-adaptation, Particle Swarm Optimization (PSO), discretization, Tabu Search (TS)
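The constraint mentioned in the abstract, that the classification ability of the decision table must be kept while cut points are chosen, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy data, and the fitness rule (count the cut points, and give an infinite penalty to any cut set that makes the table inconsistent) are assumptions made for illustration only. The abstract does not state the fitness function that AHPSO actually uses, and the swarm update and Tabu Search steps are omitted.

```python
from bisect import bisect_right


def discretize(value, cuts):
    """Return the index of the interval that `value` falls into.

    `cuts` must be sorted in ascending order; k cuts define k + 1 intervals.
    """
    return bisect_right(cuts, value)


def discretize_table(rows, cuts_per_attr):
    """Discretize every attribute of every row with its own list of cuts."""
    return [
        tuple(discretize(v, cuts) for v, cuts in zip(row, cuts_per_attr))
        for row in rows
    ]


def is_consistent(disc_rows, labels):
    """True if no two identical discretized rows carry different labels."""
    seen = {}
    for row, label in zip(disc_rows, labels):
        if seen.setdefault(row, label) != label:
            return False
    return True


def fitness(rows, labels, cuts_per_attr):
    """Hypothetical fitness (an assumption, not taken from the paper).

    Fewer cut points is better; a cut set that breaks the consistency of
    the decision table is rejected with an infinite penalty.
    """
    disc_rows = discretize_table(rows, cuts_per_attr)
    if not is_consistent(disc_rows, labels):
        return float("inf")
    return sum(len(cuts) for cuts in cuts_per_attr)


# Toy decision table: two continuous attributes and one decision label.
rows = [(1.0, 5.0), (2.0, 6.0), (3.0, 1.0), (4.0, 2.0)]
labels = ["a", "a", "b", "b"]

good = fitness(rows, labels, [[2.5], []])  # one cut separates the classes
bad = fitness(rows, labels, [[1.5], []])   # cut in the wrong place
print(good, bad)
```

In a particle swarm setting, each particle would encode one candidate cut set, and a function of this kind would score it. Rejecting inconsistent cut sets outright is one simple way to enforce the constraint; a graded penalty is another option.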

摘要: 针对经典粗糙集只能处理离散型属性的问题,提出一种基于自适应混合粒子群优化(AHPSO)的离散化算法。首先,引入自适应调整策略,以克服粒子群易陷入局部解的缺点,提高了粒子群全局寻优能力;然后对每一代全局最优粒子进行禁忌搜索(TS),得到当代最佳全局最优粒子,增强了粒子群局部搜索能力;最后,在保持决策表分类能力不变的情况下,将属性离散化分割点初始化为粒子群体,通过粒子间的相互作用得到最佳的离散化分割点。使用WEKA平台上的J48决策树分类方法,与基于属性重要度、信息熵的离散化算法相比,该算法的分类精度提升了10%~20%;与基于小生境离散粒子群优化(NDPSO)、参数线性递减粒子群的离散化算法相比,该算法的分类精度提升了2%~5%。实验结果表明,该算法显著地提高了J48决策树的分类学习精度,在对数据离散化时也有较好的性能。

关键词: 经典粗糙集, 自适应, 粒子群优化, 离散化, 禁忌搜索
