Journal of Computer Applications, 2026, Vol. 46, Issue (3): 775-780. DOI: 10.11772/j.issn.1001-9081.2025030305
• Data science and technology •
Hao LI¹, Lei WANG², Le SUN², Youxi WU¹
Received: 2025-03-25
Revised: 2025-06-23
Accepted: 2025-06-25
Online: 2025-07-07
Published: 2026-03-10
Contact: Youxi WU
About author: LI Hao, born in 2000 in Zhangjiakou, Hebei, M. S. candidate. His research interests include data mining.
Hao LI, Lei WANG, Le SUN, Youxi WU. Rare sequential pattern mining method with adaptive gap under one-off condition[J]. Journal of Computer Applications, 2026, 46(3): 775-780.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2025030305
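| Method | Function | Gap type | Support calculation |
|---|---|---|---|
| Ref. [ | Maximal pattern mining | Gap constraint | Repetitive |
| Ref. [ | Closed pattern mining | Adaptive gap | Discriminative |
| Ref. [ | Rare pattern mining | Adaptive gap | Discriminative |
| Ref. [ | Rare pattern mining | Adaptive gap | Discriminative |
| Ref. [ | High-utility pattern mining | Adaptive gap | Repetitive |
| Proposed method | Rare pattern mining | Adaptive gap | Repetitive |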
Tab. 1 Comparison of different pattern mining methods
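The "Support calculation" column of Tab. 1 can be illustrated with a small sketch. It rests on assumptions that this page does not confirm: "discriminative" is read here as counting a sequence once if it contains the pattern at all, and "repetitive" as counting occurrences inside a sequence under the one-off condition (each position used at most once) with no gap limits. The function names are hypothetical, and the greedy leftmost search is only an illustration, not necessarily the support computation used in the paper.

```python
def contains_pattern(sequence, pattern):
    """Discriminative reading (assumed): True if pattern is a subsequence."""
    it = iter(sequence)
    return all(event in it for event in pattern)


def one_off_occurrences(sequence, pattern):
    """Repetitive reading (assumed): greedy count of occurrences in which
    every position of the sequence is used at most once (one-off condition)
    and gaps between matched events are unrestricted (adaptive gap)."""
    if not pattern:
        return 0
    used = [False] * len(sequence)
    count = 0
    while True:
        pos = -1
        picked = []
        for event in pattern:
            # Leftmost unused position after the previous match.
            pos = next(
                (i for i in range(pos + 1, len(sequence))
                 if not used[i] and sequence[i] == event),
                None,
            )
            if pos is None:
                return count  # no further complete occurrence
            picked.append(pos)
        for i in picked:
            used[i] = True
        count += 1
```

Under these assumptions, the sequence `"abab"` contributes 1 to the discriminative support of `"ab"` but 2 to its repetitive one-off support.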
| Dataset | Source | Number of events | Number of sequences | Total sequence length |
|---|---|---|---|---|
| SDB1 | Air Quality | 8 | 1 460 | 70 099 |
| SDB2 | Location | 13 | 35 | 5 549 |
| SDB3 | MSNBC | 17 | 990 | 12 193 |
| SDB4 | Anticancer Peptides | 20 | 151 | 5 467 |
| SDB5 | Skin | 9 | 6 545 | 58 904 |
Tab. 2 Information of experimental datasets
| Algorithm | Inverted index | Pruning strategy | Pattern fusion |
|---|---|---|---|
| PMBC-ORP | ✕ | √ | √ |
| HAOP-ORP | ✕ | √ | √ |
| ORP-NoPrune | √ | ✕ | √ |
| ORP-B | √ | √ | ✕ |
| ORP-D | √ | √ | ✕ |
| Proposed algorithm | √ | √ | √ |
Tab. 3 Verification relationships in ablation experiments
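Tab. 3 lists an inverted index as one of the components checked in the ablation. The page does not describe how the index is built, so the following is only a generic sketch of the data structure: each event maps to the sorted list of positions where it occurs, and a binary search finds the next occurrence after a given position. The function names and the single-sequence layout are assumptions made for illustration.

```python
from bisect import bisect_right
from collections import defaultdict


def build_inverted_index(sequence):
    """Map each event to the ascending list of positions where it occurs."""
    index = defaultdict(list)
    for pos, event in enumerate(sequence):
        index[event].append(pos)
    return dict(index)


def next_position(index, event, after):
    """Return the first position of `event` strictly greater than `after`,
    or None if there is no such position."""
    positions = index.get(event, [])
    k = bisect_right(positions, after)
    return positions[k] if k < len(positions) else None
```

With such an index, locating the next candidate position for an event is a lookup plus a binary search rather than a scan over the sequence.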
| Algorithm | Time/s (SDB1) | Time/s (SDB2) | Time/s (SDB3) | Time/s (SDB4) | Time/s (SDB5) | Candidates (SDB1) | Candidates (SDB2) | Candidates (SDB3) | Candidates (SDB4) | Candidates (SDB5) |
|---|---|---|---|---|---|---|---|---|---|---|
| PMBC-ORP[ | 11.14 | 23.28 | 11.34 | 8.58 | 8.17 | 44 | 1 550 | 53 | 227 | 22 |
| HAOP-ORP[ | 4.14 | 6.18 | 4.49 | 3.08 | 5.07 | 44 | 1 550 | 53 | 227 | 22 |
| ORP-NoPrune | 23.41 | 5.92 | 1 744.28 | 705.86 | 79.81 | 584 | 2 379 | 88 740 | 168 420 | 819 |
| ORP-B | 3.85 | 5.13 | 11.36 | 4.79 | 12.75 | 72 | 2 106 | 700 | 990 | 126 |
| ORP-D | 5.14 | 6.84 | 12.03 | 5.72 | 14.03 | 72 | 2 106 | 700 | 990 | 126 |
| ORP | 3.25 | 4.45 | 4.11 | 2.86 | 5.03 | 44 | 1 550 | 53 | 227 | 22 |
Tab. 4 Comparison of running time and number of candidate patterns among different algorithms
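The running times in Tab. 4 can be turned into speedup ratios relative to ORP. The sketch below copies the running times of three algorithms from the table; the variable and function names are illustrative, and the ratio (baseline time divided by ORP time) is one possible summary, not a figure reported on this page.

```python
DATASETS = ["SDB1", "SDB2", "SDB3", "SDB4", "SDB5"]

# Running times in seconds, copied from Tab. 4.
RUNNING_TIME = {
    "PMBC-ORP": [11.14, 23.28, 11.34, 8.58, 8.17],
    "HAOP-ORP": [4.14, 6.18, 4.49, 3.08, 5.07],
    "ORP": [3.25, 4.45, 4.11, 2.86, 5.03],
}


def speedup_over(baseline, reference="ORP"):
    """Per-dataset ratio baseline_time / reference_time."""
    return [
        b / r
        for b, r in zip(RUNNING_TIME[baseline], RUNNING_TIME[reference])
    ]


if __name__ == "__main__":
    for name in ("PMBC-ORP", "HAOP-ORP"):
        ratios = speedup_over(name)
        line = ", ".join(
            f"{d}: {x:.2f}x" for d, x in zip(DATASETS, ratios)
        )
        print(f"{name} time / ORP time -> {line}")
```

A ratio above 1 means ORP finished faster than the baseline on that dataset.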
| Algorithm | Time/s (α=24) | Time/s (α=22) | Time/s (α=20) | Time/s (α=18) | Time/s (α=16) | Candidates (α=24) | Candidates (α=22) | Candidates (α=20) | Candidates (α=18) | Candidates (α=16) |
|---|---|---|---|---|---|---|---|---|---|---|
| PMBC-ORP | 8.52 | 11.14 | 14.86 | 14.91 | 16.68 | 41 | 44 | 47 | 50 | 56 |
| HAOP-ORP | 3.38 | 4.14 | 4.63 | 4.69 | 5.28 | 41 | 44 | 47 | 50 | 56 |
| ORP-NoPrune | 23.29 | 23.41 | 23.34 | 23.43 | 23.23 | 584 | 584 | 584 | 584 | 584 |
| ORP-B | 2.91 | 3.85 | 6.61 | 6.88 | 8.59 | 60 | 72 | 120 | 120 | 138 |
| ORP-D | 4.09 | 5.14 | 7.93 | 7.98 | 9.80 | 60 | 72 | 120 | 120 | 138 |
| ORP | 2.16 | 3.25 | 4.20 | 4.32 | 5.12 | 41 | 44 | 47 | 50 | 56 |
Tab. 5 Comparison of running time and number of candidate patterns among various algorithms under different α values
| Algorithm | Time/s (β=60) | Time/s (β=80) | Time/s (β=100) | Time/s (β=120) | Time/s (β=140) | Candidates (β=60) | Candidates (β=80) | Candidates (β=100) | Candidates (β=120) | Candidates (β=140) |
|---|---|---|---|---|---|---|---|---|---|---|
| PMBC-ORP | 8.56 | 8.58 | 8.69 | 8.57 | 8.48 | 227 | 227 | 227 | 227 | 227 |
| HAOP-ORP | 3.01 | 3.08 | 3.07 | 3.21 | 3.39 | 227 | 227 | 227 | 227 | 227 |
| ORP-NoPrune | 657.65 | 705.86 | 670.17 | 703.92 | 668.18 | 168 420 | 168 420 | 168 420 | 168 420 | 168 420 |
| ORP-B | 4.61 | 4.79 | 5.03 | 5.14 | 5.18 | 990 | 990 | 990 | 990 | 990 |
| ORP-D | 5.62 | 5.72 | 5.88 | 5.90 | 5.94 | 990 | 990 | 990 | 990 | 990 |
| ORP | 2.75 | 2.86 | 2.98 | 3.09 | 3.26 | 227 | 227 | 227 | 227 | 227 |
Tab. 6 Comparison of running time and number of candidate patterns among various algorithms under different β values
| [1] | LIU Y, WANG L, YANG P, et al. Mining interpretable regional co-location patterns based on urban functional region division [J]. Data Science and Engineering, 2024, 9(4): 464-485. |
| [2] | DAI Z L, HAN M, YANG W Y, et al. Survey of sequential pattern mining [J]. Journal of Computer Applications, 2025, 45(7): 2056-2069. |
| [3] | LI Y, MA C, GAO R, et al. OPF-Miner: order-preserving pattern mining with forgetting mechanism for time series [J]. IEEE Transactions on Knowledge and Data Engineering, 2024, 36(12): 8981-8995. |
| [4] | YAN H, LI F, HSIEH M C, et al. High-utility sequential pattern mining in incremental database [J]. The Journal of Supercomputing, 2025, 81: No.81. |
| [5] | DONG X, QIU P, LU J, et al. Mining top-k useful negative sequential patterns via learning [J]. IEEE Transactions on Neural Networks and Learning Systems, 2019, 30(9): 2764-2778. |
| [6] | MIN F, ZHANG Z H, ZHAI W J, et al. Frequent pattern discovery with tri-partition alphabets [J]. Information Sciences, 2020, 507: 715-732. |
| [7] | MENG Y F, WU Y X, WANG Z, et al. Contrast order-preserving pattern mining algorithm [J]. Journal of Computer Applications, 2023, 43(12): 3740-3746. |
| [8] | FOURNIER-VIGER P, GAN W, WU Y, et al. Pattern mining: current challenges and opportunities [C]// Proceedings of the 2022 International Conference on Database Systems for Advanced Applications, LNCS 13248. Cham: Springer, 2022: 34-49. |
| [9] | GENG M, WU Y, LI Y, et al. RNP-Miner: repetitive nonoverlapping sequential pattern mining [J]. IEEE Transactions on Knowledge and Data Engineering, 2024, 36(9): 4874-4889. |
| [10] | GAN W, LIN J C W, FOURNIER-VIGER P, et al. A survey of parallel sequential pattern mining [J]. ACM Transactions on Knowledge Discovery from Data, 2019, 13(3): No.25. |
| [11] | SUN C, GONG Y, GUO Y, et al. SN-RNSP: mining self-adaptive nonoverlapping repetitive negative sequential patterns in transaction sequences [J]. Knowledge-Based Systems, 2024, 287: No.111449. |
| [12] | WU Y, WANG X, LI Y, et al. OWSP-Miner: self-adaptive one-off weak-gap strong pattern mining [J]. ACM Transactions on Management Information Systems, 2022, 13(3): No.25. |
| [13] | CAI S, CHEN J, CHEN H, et al. An efficient anomaly detection method for uncertain data based on minimal rare patterns with the consideration of anti-monotonic constraints [J]. Information Sciences, 2021, 580: 620-642. |
| [14] | WU X, ZHU X, HE Y, et al. PMBC: pattern mining from biological sequences with wildcard constraints [J]. Computers in Biology and Medicine, 2013, 43(5): 481-492. |
| [15] | ZHOU Z Y, PI D C. Rare sequence pattern mining algorithm for shopping basket data [J]. Journal of Chinese Computer Systems, 2019, 40(3): 683-688. |
| [16] | JAYSAWAL B P, HUANG J W. PSP-AMS: progressive mining of sequential patterns across multiple streams [J]. ACM Transactions on Knowledge Discovery from Data, 2019, 13(1): No.5. |
| [17] | DONG X, GONG Y, CAO L. e-RNSP: An efficient method for mining repetition negative sequential patterns [J]. IEEE Transactions on Cybernetics, 2020, 50(5): 2084-2096. |
| [18] | LI Y, WANG Z, LIU J, et al. Mining repetitive negative sequential patterns with gap constraints [J]. ACM Transactions on Knowledge Discovery from Data, 2025, 19(4): No.86. |
| [19] | YANG H X, WU Y X, GENG M, et al. Efficient one-off weak gap sequential pattern mining algorithm [J]. Computer Engineering, 2024, 50(3): 60-67. |
| [20] | SALETI S, SUBRAMANYAM R B V. A MapReduce solution for incremental mining of sequential patterns from big data [J]. Expert Systems with Applications, 2019, 133: 109-125. |
| [21] | HUANG G, GAN W, YU P S. TaSPM: targeted sequential pattern mining [J]. ACM Transactions on Knowledge Discovery from Data, 2024, 18(5): No.114. |
| [22] | GAN W, LIN J C W, ZHANG J, et al. Fast utility mining on sequence data [J]. IEEE Transactions on Cybernetics, 2021, 51(2): 487-500. |
| [23] | LI G, XIANG J, FANG W, et al. HUPSP-LAL: efficiently mining utility-driven sequential patterns in uncertain sequences [J]. Expert Systems with Applications, 2025, 270: No.126536. |
| [24] | WAN X, HAN X. Efficient top-k frequent itemset mining on massive data [J]. Data Science and Engineering, 2024, 9(2): 177-203. |
| [25] | LI Y, ZHANG S, GUO L, et al. NetNMSP: nonoverlapping maximal sequential pattern mining [J]. Applied Intelligence, 2022, 52(9): 9861-9884. |
| [26] | ZHANG H Z, HONG Z, WANG C, et al. Closed sequential patterns mining based unknown protocol format inference method [J]. Computer Science, 2019, 46(6): 80-89. |
| [27] | LEI Y, LI M, HU W S, et al. Efficient methods for rare sequential pattern mining [J]. Journal of Frontiers of Computer Science and Technology, 2015, 9(4): 429-437. |
| [28] | ZHANG P, CHEN J, WAN S, et al. Targeted mining of rare high-utility patterns [C]// Proceedings of the 2022 IEEE International Conference on Big Data. Piscataway: IEEE, 2022: 6271-6280. |
| [29] | WANG T, DUAN L, DONG G, et al. Efficient mining of outlying sequence patterns for analyzing outlierness of sequence data [J]. ACM Transactions on Knowledge Discovery from Data, 2020, 14(5): No.62. |
| [30] | WU Y, LEI R, LI Y, et al. HAOP-Miner: self-adaptive high-average utility one-off sequential pattern mining [J]. Expert Systems with Applications, 2021, 184: No.115449. |