Journal of Computer Applications ›› 2013, Vol. 33 ›› Issue (08): 2184-2187.

• Database Technology •

Quantitative associative classification based on lazy method

LI Xueming1, LI Binfei1, YANG Tao2, WU Haiyan1

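  1. College of Computer Science, Chongqing University, Chongqing 400044;
  2. School of Software Engineering, University of Science and Technology of China, Hefei 230027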
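  • Received: 2013-02-07 Revised: 2013-04-16 Published: 2013-08-01 Online: 2013-09-11
  • Corresponding author: LI Binfei
  • About the authors: LI Xueming (born 1962), male, from Chongqing, professor, Ph.D.; research interests: data mining, computer networks;
    LI Binfei (born 1989), female, from Anyang, Henan, master's candidate; research interests: data mining;
    YANG Tao (born 1989), male, from Xinyang, Henan, master's candidate; research interests: data mining, computer networks;
    FU Meng (born 1988), female, from Baoding, Hebei, master's candidate; research interests: data mining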
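  • Supported by:

    National Natural Science Foundation of China; Key Project of Chongqing Higher Education Teaching Reform Research; Fundamental Research Funds for the Central Universities; Phase III Construction Project of the "211 Project"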

Quantitative associative classification based on lazy method

LI Xueming1,LI Binfei1,YANG Tao2,WU Haiyan1   

  1. College of Computer Science, Chongqing University, Chongqing 400044, China
  2. School of Software Engineering, University of Science and Technology of China, Hefei, Anhui 230027, China
  • Received:2013-02-07 Revised:2013-04-16 Online:2013-09-11 Published:2013-08-01
  • Contact: LI Binfei

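Abstract: When traditional associative classification methods handle quantitative data, the "discretize first, then learn" procedure can leave a new test instance without a suitable discretization interval, which causes the blind discretization problem. Lazy quantitative associative classification, a new associative classification method, first uses the K-nearest-neighbor idea to find the K nearest neighbors of the test instance and takes them as the new training dataset. It then discretizes the dataset made up of the test instance and its K neighbors, mines association rules on the discretized K-nearest-neighbor dataset, and builds a classifier to perform the classification. Finally, comparative experiments against the traditional CBA, CMAR and CPAR algorithms on seven commonly used UCI quantitative datasets show that the lazy quantitative associative classification method raises the average classification accuracy by 0.66% to 1.65%, which demonstrates its feasibility.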

Keywords: data mining, lazy method, quantitative associative classification, association rule, K-nearest neighbors

Abstract: To avoid the blind discretization problem caused by the "discretize first, learn second" procedure of traditional associative classification, a new associative classification method based on the lazy approach was proposed. It first determined the K-nearest neighbors of the test instance and took them as the new training dataset, then discretized this dataset, mined association rules from the discretized dataset, and built a classifier to predict the class label of the test instance. Finally, comparative experiments against CBA (Classification Based on Associations), CMAR (Classification based on Multiple Class-Association Rules) and CPAR (Classification based on Predictive Association Rules) on seven commonly used UCI quantitative datasets show that the proposed method increases the average classification accuracy by 0.66% to 1.65%, which verifies its feasibility.

Key words: data mining, lazy method, quantitative associative classification, associative rule, K-Nearest Neighbors (KNN)
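
The pipeline described in the abstract (find the K nearest neighbors, discretize them together with the test instance, mine rules, classify) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the distance measure, the discretization scheme, the rule thresholds or the way the classifier is built. The sketch assumes Euclidean distance, equal-width binning, and prediction by the single matching rule with the highest confidence; all function names, parameters and toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations
import math


def k_nearest(train, labels, x, k):
    """Return the k (row, label) pairs closest to x (Euclidean distance, assumed)."""
    order = sorted(range(len(train)), key=lambda i: math.dist(train[i], x))
    return [(train[i], labels[i]) for i in order[:k]]


def equal_width_bins(rows, n_bins):
    """Build a row -> bin-index-tuple function from per-attribute min/max (assumed scheme)."""
    lows = [min(col) for col in zip(*rows)]
    highs = [max(col) for col in zip(*rows)]

    def to_bins(row):
        out = []
        for v, lo, hi in zip(row, lows, highs):
            width = (hi - lo) / n_bins or 1.0  # constant attribute: everything in bin 0
            out.append(min(int((v - lo) / width), n_bins - 1))
        return tuple(out)

    return to_bins


def classify_lazy(train, labels, x, k=5, n_bins=3,
                  min_sup=0.2, min_conf=0.6, max_len=2):
    """Classify x from rules mined only on its k nearest neighbours."""
    neighbours = k_nearest(train, labels, x, k)
    rows = [row for row, _ in neighbours]
    # Bin boundaries come from the neighbours plus the test instance,
    # so the test instance always falls inside some interval.
    to_bins = equal_width_bins(rows + [x], n_bins)
    x_items = sorted(enumerate(to_bins(x)))  # (attribute index, bin) items of x
    data = [(set(enumerate(to_bins(row))), y) for row, y in neighbours]

    best = None  # (confidence, support, -antecedent length, class)
    for size in range(1, max_len + 1):
        # Only antecedents that match x can fire, so only those are enumerated.
        for antecedent in combinations(x_items, size):
            covered = [y for items, y in data if items.issuperset(antecedent)]
            if not covered:
                continue
            cls, hits = Counter(covered).most_common(1)[0]
            support, confidence = hits / len(data), hits / len(covered)
            if support >= min_sup and confidence >= min_conf:
                rule = (confidence, support, -size, cls)
                if best is None or rule[:3] > best[:3]:
                    best = rule
    if best is not None:
        return best[3]
    # No rule passed the thresholds: fall back to the neighbours' majority class.
    return Counter(y for _, y in neighbours).most_common(1)[0][0]


# Toy data (hypothetical): two well-separated clusters.
train = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]
print(classify_lazy(train, labels, (1.1, 1.0), k=3))  # prints "a"
```

Because discretization happens after the neighbors are selected, the interval boundaries adapt to each test instance, which is the property the abstract contrasts with "discretize first, then learn".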

CLC Number: