Algorithm for mining maximum frequent itemsets based on FP-tree

doi:10.3724/SP.J.1087.2005.0998

Journal of Computer Applications (计算机应用), 2005, Vol. 25, Issue 05: 998-1000.

LIU Nai-li (刘乃丽), LI Yu-chen (李玉忱), MA Lei (马磊)

School of Computer Science & Technology, Shandong University

Online: 2005-05-01 Published: 2005-05-01

Abstract

Abstract: Mining association rules is an important research topic in data mining, and mining maximum frequent itemsets is one of its key problems. Many earlier algorithms for mining maximum frequent itemsets first generate candidate itemsets and then test them, but candidate generation is expensive, especially when many long patterns exist. This paper improves the FP-tree structure and proposes DMFIA-1, a fast FP-tree-based algorithm for mining maximum frequent itemsets. DMFIA-1 does not generate maximum frequent candidate itemsets and mines maximum frequent itemsets more efficiently than DMFIA. The improved FP-tree is one-way: each node keeps only a pointer to its parent and no pointers to its children, which saves about one third of the tree's space.

Key words: data mining, maximum frequent itemset, association rule, FP-tree
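The one-way tree described in the abstract can be illustrated with a minimal Python sketch. It is not DMFIA-1 itself and does not mine maximum frequent itemsets; it only shows an FP-tree whose nodes hold a parent pointer and no child pointers. The names `Node`, `build_tree` and `prefix_path`, the construction-time `index` dictionary, and the per-item `header` lists are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter, defaultdict


class Node:
    """FP-tree node that stores only an upward (parent) pointer."""
    __slots__ = ("item", "count", "parent")

    def __init__(self, item, parent):
        self.item = item
        self.count = 0
        self.parent = parent


def build_tree(transactions, min_support):
    """Build a parent-pointer-only FP-tree; return (header, freq) tables."""
    freq = Counter(item for t in transactions for item in set(t))
    freq = {i: c for i, c in freq.items() if c >= min_support}
    root = Node(None, None)
    header = defaultdict(list)  # item -> nodes carrying that item
    index = {}  # (id(parent), item) -> node; assumed helper, construction only
    for t in transactions:
        # Keep frequent items, ordered by descending support (ties by name).
        items = sorted((i for i in set(t) if i in freq),
                       key=lambda i: (-freq[i], i))
        node = root
        for item in items:
            key = (id(node), item)
            child = index.get(key)
            if child is None:
                child = Node(item, node)
                index[key] = child
                header[item].append(child)
            child.count += 1
            node = child
    return header, freq


def prefix_path(node):
    """Walk parent pointers from a node up to (excluding) the root."""
    path = []
    node = node.parent
    while node is not None and node.item is not None:
        path.append(node.item)
        node = node.parent
    return path[::-1]


if __name__ == "__main__":
    data = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
    header, freq = build_tree(data, min_support=2)
    for n in header["c"]:
        print(prefix_path(n), n.count)
```

In this sketch the `index` dictionary is needed only while the tree is built and can be discarded afterwards. Bottom-up traversal from the `header` lists then needs nothing but the parent pointers, which is the property the abstract credits for the space saving.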

CLC Number: TP311.13

LIU Nai-li, LI Yu-chen, MA Lei. Algorithm for mining maximum frequent itemsets based on FP-tree[J]. Journal of Computer Applications, 2005, 25(05): 998-1000.
