Algorithm for mining maximum frequent itemsets based on decreasing dimension of frequent itemset in association rules

doi:10.3724/SP.J.1087.2011.01339

Journal of Computer Applications, 2011, Vol. 31, Issue (05): 1339-1343. DOI: 10.3724/SP.J.1087.2011.01339

• Database technology •

QIAN Xue-zhong, HUI Liang

School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China

Received: 2010-10-11 Revised: 2010-11-14 Online: 2011-05-01 Published: 2011-05-01

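Corresponding author: HUI Liang
About the authors: QIAN Xue-zhong (born 1967), male, from Wuxi, Jiangsu; associate professor, M.S.; research interests: databases, data mining, network security. HUI Liang (born 1983), male, from Pizhou, Jiangsu; master's student; research interests: databases, data mining.
Funding:
Natural Science Foundation of Jiangsu Province (BK20003017).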

Abstract

FP-tree-based algorithms for mining maximal frequent patterns perform well but have notable drawbacks: they must recursively generate conditional FP-trees, and they produce many candidate maximal frequent itemsets. To overcome these drawbacks, an algorithm named Based on Dimensionality Reduction of Frequent Itemset (BDRFI) for mining maximal frequent patterns was put forward after an analysis of the FPMax and DMFIA algorithms. To make superset checking more efficient, the algorithm replaces the FP-tree with a Digital Frequent Pattern Tree (DFP-tree), and it reduces the number of mining passes through prediction and pruning before mining. During mining, candidate frequent itemsets are generated by decreasing the dimension of frequent itemsets, which both reduces the number of candidates and avoids building conditional FP-trees separately and recursively. Experimental results show that BDRFI is 2 to 8 times as efficient as other similar algorithms.

Key words: association rule, data mining, maximum frequent itemset, Frequent Pattern tree (FP-tree), decreasing dimension
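
For readers unfamiliar with the terms in the abstract, the following is a minimal sketch of a generic top-down search for maximal frequent itemsets. It is not the BDRFI algorithm and uses no FP-tree or DFP-tree; it only illustrates two ideas the abstract names: shrinking an infrequent candidate by one item at a time ("decreasing dimension") and skipping candidates that are already covered by a known maximal itemset (superset checking). The function names `support` and `maximal_frequent_itemsets`, the plain list used for superset checks, and the sample transactions are all assumptions made for illustration.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def maximal_frequent_itemsets(transactions, min_support):
    """Illustrative top-down search (assumes min_support >= 1).

    Candidates are examined from the largest size downwards. A frequent
    candidate that is not covered by a known maximal itemset is maximal;
    an infrequent candidate is replaced by its subsets with one item fewer.
    """
    transactions = [frozenset(t) for t in transactions]

    # Drop items that are infrequent on their own: they cannot appear
    # in any frequent itemset.
    all_items = {item for t in transactions for item in t}
    frequent_items = frozenset(
        item for item in all_items
        if support(frozenset([item]), transactions) >= min_support
    )

    # Initial candidates: each transaction restricted to frequent items.
    pending = {t & frequent_items for t in transactions}
    pending.discard(frozenset())

    maximal = []
    while pending:
        k = max(len(c) for c in pending)
        current = {c for c in pending if len(c) == k}
        pending -= current
        for candidate in current:
            # Superset check: skip candidates inside a known maximal itemset.
            if any(candidate <= m for m in maximal):
                continue
            if support(candidate, transactions) >= min_support:
                maximal.append(candidate)
            elif k > 1:
                # Decrease the dimension: try every (k-1)-item subset.
                pending.update(
                    frozenset(s) for s in combinations(candidate, k - 1)
                )
    return maximal


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the paper.
    sample = [
        {"a", "b", "c"},
        {"a", "b"},
        {"a", "c"},
        {"b", "c"},
        {"a", "b", "c", "d"},
    ]
    found = maximal_frequent_itemsets(sample, min_support=3)
    print(sorted("".join(sorted(m)) for m in found))  # ['ab', 'ac', 'bc']
```

With a minimum support of 3, the itemset {a, b, c} occurs in only two transactions, so the sketch shrinks it to its three two-item subsets, each of which occurs three times and is therefore maximal. The linear scan over `maximal` is the simplest possible superset check; the abstract indicates that the paper uses a DFP-tree to make this step more efficient.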

QIAN Xue-zhong, HUI Liang. Algorithm for mining maximum frequent itemsets based on decreasing dimension of frequent itemset in association rules[J]. Journal of Computer Applications, 2011, 31(05): 1339-1343.
