计算机应用 ›› 2005, Vol. 25 ›› Issue (05): 998-1000.DOI: 10.3724/SP.J.1087.2005.0998

• 数据挖掘 • 上一篇    下一篇

一种基于FP-tree的最大频繁项目集挖掘算法

 刘乃丽,李玉忱,马磊   

  1. 山东大学计算机科学与技术学院
  • 发布日期:2005-05-01 出版日期:2005-05-01

Algorithm for mining maximum frequent itemsets based on FP-tree

LIU Nai-li, LI Yu-chen, MA Lei   

  1. School of Computer Science & Technology, Shandong University
  • Online:2005-05-01 Published:2005-05-01

摘要: 挖掘关联规则是数据挖掘领域中的重要研究内容,其中挖掘最大频繁项目集是挖掘关联规则中的关键问题之一,以前的许多挖掘最大频繁项目集算法是先生成候选,再进行检验,然而候选项目集产生的代价是很高的,尤其是存在大量长模式的时候。文中改进了FP 树结构,提出了一种基于FP tree的快速挖掘最大频繁项目集的算法DMFIA 1,该算法不需要生成最大频繁候选项目集,比DMFIA算法挖掘最大频繁项目集的效率更高。改进的FP 树是单向的,每个结点只保留指向父结点的指针,这大约节省了三分之一的树空间。

关键词: 数据挖掘, 最大频繁项目集, 关联规则, 频繁模式树

Abstract: Mining association rule is an important matter in data mining, in which mining maximum frequent itemsets is a key problem in mining association rule. Many of the previous algorithms mine maximum frequent itemsets by producing candidate itemsets firstly, then pruning. But the cost of producing candidate itemsets is very high, especially when there exist long patterns. In this paper, the structure of a FP-tree was improved, a fast algorithm DMFIA-1 based on FP-tree for mining maximum frequent itemsets was proposed, which did not produce maximum frequent candidate itemsets and was more effective than DMFIA. The new FP-tree is a one-way tree and there is no pointer pointing its children in each node, so at least one third of memory is saved.

Key words: data mining, maximum frequent itemset, association rule, FP-tree

中图分类号: