Algorithm for discovering frequent item sets based on optimized and regrouped item sets

Journal of Computer Applications (计算机应用) ›› 2010, Vol. 30 ›› Issue (9): 2332-2334.

WANG Ming¹, SONG Shunlin²

1. Jiangsu University
2. School of Computer Science, Jiangsu University

Received: 2010-03-12 Revised: 2010-05-10 Online: 2010-09-03 Published: 2010-09-01
Corresponding author: WANG Ming
Funding:
Jiangsu Province Key Fund Project for Industrial Informatization

Abstract

Abstract: Discovering frequent item sets is the main route to association rule mining and the focus of research on association rule mining algorithms. The classical Apriori algorithm and its improved variants fall broadly into two categories: SQL-based and memory-based. To improve mining efficiency, the authors analyzed the efficiency bottlenecks of memory-based algorithms and, on that basis, proposed an improved algorithm for discovering frequent item sets. The algorithm uses a fast method for generating and verifying candidate item sets, which speeds up item set generation. Experimental results show that the proposed algorithm effectively improves mining efficiency.

Key words: data mining, frequent item set, item set array, logical operation, association rule
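
To make the generate-and-verify loop mentioned in the abstract concrete, the following is a minimal Python sketch of the classical Apriori procedure in which candidate support is checked with a logical AND over bit vectors. It is not the authors' algorithm, whose details are not given in this abstract. The function name `frequent_itemsets`, the integer bit-vector representation, and the sample transactions are all assumptions chosen only for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {frozenset(items): support_count} for all frequent item sets.

    Illustrative Apriori sketch (an assumption, not the paper's method):
    each item maps to an integer bit vector whose bit t is set when
    transaction t contains that item.
    """
    # Build one bit vector per item.
    item_bits = {}
    for t, items in enumerate(transactions):
        for item in items:
            item_bits[item] = item_bits.get(item, 0) | (1 << t)

    # Level 1: keep the single items that meet the support threshold.
    current = {
        frozenset([item]): bits
        for item, bits in item_bits.items()
        if bin(bits).count("1") >= min_support
    }
    result = {s: bin(b).count("1") for s, b in current.items()}

    k = 2
    while current:
        candidates = {}
        for a, b in combinations(current, 2):
            union = a | b
            if len(union) != k or union in candidates:
                continue
            # Apriori pruning: every (k-1)-subset must itself be frequent.
            if any(frozenset(sub) not in current
                   for sub in combinations(union, k - 1)):
                continue
            # Verify the candidate with a single bitwise AND.
            bits = current[a] & current[b]
            if bin(bits).count("1") >= min_support:
                candidates[union] = bits
        result.update({s: bin(b).count("1") for s, b in candidates.items()})
        current = candidates
        k += 1
    return result


# Hypothetical sample data: five transactions over items a, b, c.
sample = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
found = frequent_itemsets(sample, min_support=3)
print(found[frozenset({"a", "b"})])  # 3
```

In this sketch the support of a candidate is the number of set bits in the AND of its parents' bit vectors, so no rescan of the transaction list is needed once the per-item vectors are built. Whether the paper's item set array works this way cannot be determined from the abstract alone.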


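WANG Ming, SONG Shunlin. Algorithm for discovering frequent item sets based on optimized and regrouped item sets[J]. Journal of Computer Applications (计算机应用), 2010, 30(9): 2332-2334.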
