Journal of Computer Applications

Survey of sequential pattern mining

  

  • Received: 2024-07-09 Revised: 2024-09-06 Online: 2024-11-19 Published: 2024-11-19

代震龙,韩萌,杨文艳,朱诗能,杨书蓉   

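  1. North Minzu University
  • Corresponding author: 韩萌
  • Funding:
    National Natural Science Foundation of China project; Natural Science Foundation of Ningxia project; North Minzu University innovation project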

Abstract: Sequential pattern mining aims to discover interesting patterns or rules in databases to support and guide user decision-making. Research on sequential pattern mining algorithms has deepened in recent years, and with the emergence of large-scale data, many algorithms suited to parallel environments have been proposed. This paper reviews existing serial and parallel sequential pattern mining algorithms. Serial algorithms are classified by the data structure they use (tree structures, list structures, linked structures, and others); the advantages and shortcomings of each structure are summarized, and the strengths and weaknesses of each algorithm are detailed. For parallel algorithms, existing distributed frameworks are classified, for the first time, according to the characteristics of their storage structures; the advantages and disadvantages of these frameworks are analyzed, and the parallel algorithms built on each framework are introduced and analyzed. Finally, future research directions are proposed to address the shortcomings of existing sequential pattern mining methods.

Key words: survey, sequential pattern mining, tree structure, list structure, distributed framework
