Journal of Computer Applications ›› 2011, Vol. 31 ›› Issue (05): 1339-1343. DOI: 10.3724/SP.J.1087.2011.01339
• Database technology •
QIAN Xue-zhong, HUI Liang
钱雪忠,惠亮
Supported by: Natural Science Foundation of Jiangsu Province (BK20003017).
Abstract: FP-tree-based algorithms for mining maximal frequent patterns are among the more efficient frequent-pattern mining algorithms, but they have drawbacks: they must recursively generate conditional FP-trees, and they produce many candidate maximal frequent itemsets. To overcome these drawbacks, and following an analysis of the FPMax and DMFIA algorithms, an algorithm named Based on Dimensionality Reduction of Frequent Itemset (BDRFI) is proposed for mining maximal frequent patterns. To make superset checking more efficient, the algorithm replaces the FP-tree with a Digital Frequent Pattern Tree (DFP-tree), and it reduces the number of mining passes through prediction and pruning before mining. During mining, candidate frequent itemsets are generated by decreasing the dimension of frequent itemsets, which both reduces the number of candidates and avoids recursively creating separate conditional FP-trees. Experimental results show that BDRFI is 2 to 8 times as efficient as comparable algorithms.
Key words: association rule, data mining, maximum frequent itemset, Frequent Pattern tree (FP-tree), decreasing dimension
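To make the two central notions in the abstract concrete (generating candidates by decreasing itemset dimension, and superset checking against maximal itemsets already found), the following is a minimal Python sketch. It is not the BDRFI algorithm: it uses a plain list scan in place of the DFP-tree, omits the prediction and pruning step, and counts support by brute force. The function names `support` and `maximal_frequent_itemsets`, and the use of an absolute support count as the threshold, are assumptions made for illustration only.

```python
def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def maximal_frequent_itemsets(transactions, min_support):
    """Top-down search: start from the set of all frequent items and
    shrink infrequent candidates by one item (one dimension) per level."""
    transactions = [frozenset(t) for t in transactions]
    all_items = {item for t in transactions for item in t}
    # Only items that are frequent on their own can appear in a frequent itemset.
    frequent_items = frozenset(
        item for item in all_items
        if support(frozenset([item]), transactions) >= min_support
    )

    maximal = []
    level = {frequent_items} if frequent_items else set()
    while level:
        next_level = set()
        for candidate in level:
            # Superset check: a subset of a known maximal itemset is frequent
            # but not maximal, so it needs no further work.
            if any(candidate <= m for m in maximal):
                continue
            if support(candidate, transactions) >= min_support:
                maximal.append(candidate)
            elif len(candidate) > 1:
                # Dimension reduction: drop one item at a time.
                for item in candidate:
                    next_level.add(candidate - {item})
        level = next_level
    return maximal


if __name__ == "__main__":
    txns = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c", "d"}, {"a", "b", "c"}]
    for itemset in maximal_frequent_itemsets(txns, min_support=3):
        print(sorted(itemset))
```

With the sample transactions and a threshold of 3, the itemset {a, b, c} is infrequent, so the sketch reduces it to {a, b}, {a, c} and {b, c}, each of which is frequent and therefore maximal. The sketch can examine exponentially many candidates in the worst case; the abstract attributes BDRFI's efficiency to the DFP-tree and to pruning before mining, neither of which is reproduced here.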
QIAN Xue-zhong, HUI Liang. Algorithm for mining maximum frequent itemsets based on decreasing dimension of frequent itemset in association rules[J]. Journal of Computer Applications, 2011, 31(05): 1339-1343.
钱雪忠 惠亮. 关联规则中基于降维的最大频繁模式挖掘算法[J]. 计算机应用, 2011, 31(05): 1339-1343.
URL: https://www.joca.cn/EN/10.3724/SP.J.1087.2011.01339
https://www.joca.cn/EN/Y2011/V31/I05/1339