Journal of Computer Applications

• Database and Data Mining •

Associative classification algorithm based on support and confidence threshold optimization

Jian Zhang, Wei Wang

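  1. Machine Learning and Cognition Laboratory, Department of Educational Technology, Nanjing Normal University; Department of Educational Technology, Southern Normal University (南方师范大学)
  • Received: 2007-06-25 Revised: 2007-08-18 Online: 2007-12-01 Published: 2007-12-01
  • Corresponding author: Jian Zhang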

Associative classification algorithm based on support and confidence threshold tuning technique

Jian Zhang

  • Received: 2007-06-25 Revised: 2007-08-18 Online: 2007-12-01 Published: 2007-12-01
  • Contact: Jian Zhang

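Abstract: In classification algorithms based on association rules, the settings of the support and confidence thresholds affect classifier accuracy. Previous associative classification algorithms set both thresholds manually from experience, which makes it hard to guarantee that the classifier consistently achieves good classification results. To address this problem, an optimization strategy can be introduced into the associative classification process. Hill-climbing search is used to find the support and confidence thresholds that maximize classification accuracy, improving the Apriori_TFP_CMAR associative classification algorithm. This avoids the harm that poorly chosen thresholds do to the final classification result and raises the classification accuracy of the associative classification algorithm.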

Keywords: associative classification, support threshold, confidence threshold, Apriori_TFP, hill climbing

Abstract: The settings of the support and confidence thresholds usually affect the accuracy of classification based on association rules. Previous associative classification algorithms set the two thresholds by experience, so it is difficult to ensure that the classifier always achieves good accuracy. To solve this problem, optimization strategies can be introduced into the associative classification algorithm. Hill-climbing search was used to find the support and confidence thresholds that yield the highest classification accuracy, thereby improving the Apriori_TFP_CMAR algorithm. This strategy avoids unreasonable threshold settings and enhances classification accuracy.

Key words: associative classification, support threshold, confidence threshold, Apriori_TFP, hill climbing

CLC number:
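
The abstract does not state the neighbourhood, step size, or stopping rule of the hill-climbing search, so the following is only a minimal Python sketch under assumed choices: thresholds expressed in percent, axis-aligned moves, and step halving when no neighbour improves. The names `hill_climb_thresholds`, `evaluate`, and `toy_accuracy` are hypothetical. `evaluate` stands in for training and testing an Apriori_TFP_CMAR classifier at the given thresholds, and `toy_accuracy` is a synthetic surface used only to make the sketch runnable, not experimental data.

```python
from typing import Callable, Tuple


def hill_climb_thresholds(
    evaluate: Callable[[float, float], float],
    start: Tuple[float, float] = (20.0, 80.0),
    step: float = 5.0,
    min_step: float = 0.5,
    bounds: Tuple[float, float] = (0.0, 100.0),
) -> Tuple[float, float, float]:
    """Return (support, confidence, accuracy) at a local maximum of `evaluate`.

    All defaults are illustrative assumptions, not values from the paper.
    """
    lo, hi = bounds
    support, confidence = start
    best = evaluate(support, confidence)
    while step >= min_step:
        # Candidate moves: one step along each axis, in each direction,
        # restricted to the allowed threshold range.
        neighbours = [
            (support + ds, confidence + dc)
            for ds, dc in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step))
            if lo <= support + ds <= hi and lo <= confidence + dc <= hi
        ]
        scored = [(evaluate(s, c), s, c) for s, c in neighbours]
        top = max(scored, default=(best, support, confidence))
        if top[0] > best:
            # Move to the best strictly improving neighbour.
            best, support, confidence = top
        else:
            # No uphill neighbour at this resolution: refine the step.
            step /= 2.0
    return support, confidence, best


def toy_accuracy(support: float, confidence: float) -> float:
    """Synthetic stand-in for classifier accuracy (assumption, not real data)."""
    return 90.0 - 0.02 * (support - 5.0) ** 2 - 0.01 * (confidence - 60.0) ** 2


if __name__ == "__main__":
    s, c, acc = hill_climb_thresholds(toy_accuracy)
    print(f"support={s}%, confidence={c}%, accuracy={acc}%")
```

In a real setting `evaluate` would be expensive (for example, cross-validated accuracy of the rule-based classifier), so caching results for threshold pairs already visited may be worthwhile. Like any hill climber, this sketch can stop at a local maximum, so the result depends on the starting point.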