Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (1): 94-108. DOI: 10.11772/j.issn.1001-9081.2021071290

• Data Science and Technology •

Survey of high utility pattern mining on dynamic data

Zhihui SHAN, Meng HAN, Qiang HAN

  1. School of Computer Science and Engineering, North Minzu University, Yinchuan Ningxia 750021, China
  • Received: 2021-07-19 Revised: 2021-08-16 Accepted: 2021-08-23 Online: 2021-08-16 Published: 2022-01-10
  • Contact: Meng HAN
  • About author: SHAN Zhihui, born in 1996 in Zhoukou, Henan, M. S. candidate, CCF member. Her research interests include pattern mining.
    HAN Meng, born in 1982 in Shangqiu, Henan, Ph. D., associate professor, CCF member. Her research interests include data mining.
    HAN Qiang, born in 1973 in Acheng, Heilongjiang, Ph. D., professor, CCF member. His research interests include workflow and trusted software.
  • Supported by:
    National Natural Science Foundation of China (62062004); Natural Science Foundation of Ningxia (2020AAC03216)

Abstract:

High Utility Pattern Mining (HUPM) takes into account both the purchase quantity and the unit profit of items, which gives more detailed information about the items and lets users make better economic decisions. Most HUPM algorithms are applied to static datasets, which do not match real-world scenarios where data is generated continuously, so HUPM algorithms for dynamic data have been proposed in recent years. Firstly, the HUPM algorithms on incremental data, data streams, dynamically deleted data and dynamically modified data, as well as the mining algorithms for integrated high utility patterns (such as high utility sequential patterns, average high utility patterns and top-k high utility patterns), were summarized. Secondly, the algorithms that handle different types of data, including dynamic profit data and dynamic sequence data, were summarized. Thirdly, the HUPM algorithms were classified and summarized from the perspectives of data structure, pruning strategy, window model, and advantages and disadvantages. Finally, in view of the shortcomings of current research, future research directions for HUPM algorithms on dynamic data were proposed.

Key words: high utility pattern, incremental data, data stream, dynamic deletion, dynamic modification, dynamic data
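
To make the notion of utility in the abstract concrete, the following is a minimal sketch, not taken from the paper, of the standard definition: the utility of an itemset is the sum of purchase quantity times unit profit over every transaction that contains the whole itemset, and an itemset is high utility when that sum reaches a user-chosen threshold. The item names, profits, transactions, the threshold value and the function names are all hypothetical, and the brute-force enumeration stands in for the pruning-based algorithms the survey covers.

```python
from itertools import combinations

# Hypothetical toy data: unit profit per item (assumed values).
UNIT_PROFIT = {"a": 5, "b": 2, "c": 1}


def itemset_utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over transactions containing the whole itemset."""
    total = 0
    for tx in transactions:
        if all(item in tx for item in itemset):
            total += sum(tx[item] * unit_profit[item] for item in itemset)
    return total


def high_utility_itemsets(transactions, unit_profit, min_utility):
    """Brute-force enumeration of all itemsets; real algorithms prune instead."""
    items = sorted(unit_profit)
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, transactions, unit_profit)
            if utility >= min_utility:
                result[itemset] = utility
    return result


# Each transaction maps an item to its purchased quantity (assumed values).
original = [{"a": 1, "b": 2}, {"b": 3, "c": 4}]
increment = [{"a": 2, "c": 1}]  # transactions arriving later

# Naive handling of dynamic data: re-mine the whole dataset after the increment.
before = high_utility_itemsets(original, UNIT_PROFIT, min_utility=10)
after = high_utility_itemsets(original + increment, UNIT_PROFIT, min_utility=10)
print(before)
print(after)
```

In this toy run the set of high utility itemsets changes once the increment arrives, which is the situation that motivates algorithms for dynamic data: re-mining from scratch, as done here, repeats work that incremental methods aim to avoid.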

CLC number: