Journal of Computer Applications, 2025, Vol. 45, Issue (7): 2056-2069. DOI: 10.11772/j.issn.1001-9081.2024070952

• The 39th CCF National Conference of Computer Applications (CCF NCCA 2024)

Survey of sequential pattern mining

Zhenlong DAI, Meng HAN, Wenyan YANG, Shineng ZHU, Shurong YANG

  1. School of Computer Science and Engineering, North Minzu University, Yinchuan Ningxia 750021, China
  • Received: 2024-07-09  Revised: 2024-09-25  Accepted: 2024-10-09  Online: 2025-07-10  Published: 2025-07-10
  • Contact: Meng HAN
  • About author: DAI Zhenlong, born in 2000, from Yuncheng, Shanxi, M. S. candidate, CCF member. His research interests include big data mining.
    YANG Wenyan (Hui ethnicity), born in 1998, from Wuzhong, Ningxia, M. S. candidate, CCF member. Her research interests include big data mining.
    ZHU Shineng, born in 1999, from Wenzhou, Zhejiang, M. S. candidate, CCF member. His research interests include big data mining.
    YANG Shurong (Tujia ethnicity), born in 1999, from Yichang, Hubei, M. S. candidate, CCF member. Her research interests include big data mining.
  • Supported by:
    National Natural Science Foundation of China (62062004); Natural Science Foundation of Ningxia (2023AAC03315); Innovation Project of North Minzu University (YCX24350)

Abstract:

Sequential Pattern Mining (SPM) aims to discover interesting patterns or rules in databases to support and guide user decision-making. In recent years, research on SPM algorithms has steadily deepened, and with the emergence of large-scale data, many SPM algorithms suited to parallel environments have been proposed. Therefore, the existing serial and parallel SPM algorithms were reviewed. Firstly, serial SPM algorithms were classified structurally, that is, by the data structure they adopt, such as tree structure, list structure, and link structure; the advantages and disadvantages of each structure were summarized comprehensively, and the strengths and weaknesses of each algorithm were described in detail. Secondly, for parallel SPM algorithms, the existing distributed frameworks were classified, for the first time, according to the characteristics of their storage structures; the advantages and disadvantages of each framework were analyzed, and the parallel algorithms were introduced and analyzed framework by framework. Finally, future research directions were discussed in view of the shortcomings of the existing SPM algorithms.

Key words: Sequential Pattern Mining (SPM), tree structure, list structure, distributed framework

CLC Number: