
Research on Maximal Fuzzy Frequent Pattern Mining Algorithms

Zhang Haiqing1, Li Daiwei1, Liu Yintian1, Gong Cheng1, Yu Xi2

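  1. Chengdu University of Information Technology
  2. School of Information Science and Technology, Chengdu University
  • Received: 2016-10-08 Revised: 2016-12-07 Published: 2016-12-07
  • Corresponding author: Yu Xi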

Research on Mining Algorithm of Maximal Fuzzy Frequent Pattern


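Summary: Efficiently mining useful but hidden information, and representing that information with an appropriate structure, matters to both the theory and the application of advanced pattern mining. The most fundamental challenges of effective pattern mining, however, are the combinatorial explosion of patterns and the effective representation of the mined results. Most existing studies have not fully solved these problems, chiefly because they generate a huge number of candidate patterns and treat item weights only as crisp, deterministic values. To address these problems, this paper proposes a fuzzy pattern structure based on the "core-traction" (core-(second-order-effect)) pattern, analyzes the fuzziness of items, defines fuzzy support and the fuzzy weight of an item in the transaction dataset, and, building on a fuzzy pruning strategy, proposes the maximal fuzzy frequent pattern mining tree (MFFP-Tree). The proposed maximal fuzzy pattern mining is efficient in three respects: the algorithm scans the dataset only once; the fuzzy pruning strategy reduces the cost of pattern extraction; and fuzzy weighting improves the reliability of the mined results. Extensive experiments on benchmark datasets show that the maximal fuzzy pattern mining algorithm performs markedly better than the PADS and FPMax* algorithms in time and space cost and in the validity of the mined results.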

Keywords: advanced pattern mining, maximal fuzzy pattern, fuzzy support, core-traction pattern structure, fuzzy pruning strategy

Abstract: Efficiently mining potentially useful but hidden information from large datasets, and representing that information with a proper structure, are important in advanced pattern mining. Combinatorial explosion and the effectiveness of the mining results are the essential challenges of meaningful pattern extraction. Existing algorithms cannot entirely solve these issues because they generate a huge number of candidate subsets and because the items are inherently uncertain. In this paper, we propose a core-(second-order-effect) pattern structure, an FFP-Tree structure, and fuzzy constraints to ensure that patterns are extracted efficiently. The proposed maximal fuzzy frequent pattern (FFP) mining algorithm scans the dataset only once and reduces pattern generation through a fuzzy pruning strategy. Experimental results on benchmark datasets show that the proposed maximal FFP mining algorithm outperforms the PADS and FPMax* algorithms.

Key words: Advanced pattern mining, Maximal fuzzy pattern, Fuzzy support, core-(second-order-effect) pattern structure, Fuzzy pruning strategy
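
The abstract names fuzzy support and maximal fuzzy frequent patterns without defining them. As a rough illustration only, the sketch below assumes a common textbook definition (the fuzzy support of an itemset is the mean, over all transactions, of the minimum membership degree of its items) and a brute-force search. It is not the paper's MFFP-Tree, its fuzzy weights, or its pruning strategy; the function names `fuzzy_support` and `maximal_fuzzy_frequent` and the sample data are hypothetical.

```python
from itertools import combinations


def fuzzy_support(itemset, transactions):
    """Assumed definition: mean over transactions of the minimum
    membership degree of the itemset's items (0.0 if an item is absent)."""
    total = 0.0
    for t in transactions:
        total += min(t.get(item, 0.0) for item in itemset)
    return total / len(transactions)


def maximal_fuzzy_frequent(transactions, min_support):
    """Brute-force enumeration of maximal fuzzy frequent itemsets.

    Each transaction is a dict mapping item -> membership degree in [0, 1].
    A pattern is maximal if no frequent proper superset exists.
    """
    items = sorted({item for t in transactions for item in t})
    frequent = []
    for size in range(1, len(items) + 1):
        level = [
            frozenset(c)
            for c in combinations(items, size)
            if fuzzy_support(c, transactions) >= min_support
        ]
        # Min-based fuzzy support never increases when an item is added,
        # so an empty level means no larger itemset can be frequent.
        if not level:
            break
        frequent.extend(level)
    return [p for p in frequent if not any(p < q for q in frequent)]


# Hypothetical toy data: membership degrees of items a, b, c per transaction.
example = [
    {"a": 0.9, "b": 0.8, "c": 0.1},
    {"a": 0.7, "b": 0.6},
    {"a": 0.8, "c": 0.9},
    {"b": 0.5, "c": 0.7},
]
result = maximal_fuzzy_frequent(example, min_support=0.3)
```

On this toy data with a threshold of 0.3, the frequent itemsets are {a}, {b}, {c} and {a, b}, so the maximal ones are {c} and {a, b}. A tree-based method such as the one the abstract describes would aim to reach the same kind of result without enumerating every candidate.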