Journal of Computer Applications, 2022, Vol. 42, Issue (4): 1137-1147. DOI: 10.11772/j.issn.1001-9081.2021071259

• The 36th CCF National Conference of Computer Applications (CCF NCCA 2020)

Ensemble classification algorithm based on dynamic weighting function

Le WANG, Meng HAN, Xiaojuan LI, Ni ZHANG, Haodong CHENG

  1. School of Computer Science and Engineering, North Minzu University, Yinchuan Ningxia 750021, China
  • Received: 2021-07-16 Revised: 2021-08-16 Accepted: 2021-08-25 Online: 2021-08-16 Published: 2022-04-10
  • Contact: Meng HAN
  • About author: WANG Le, born in 1994, M. S. candidate. Her research interests include data mining and data stream ensemble classification.
    LI Xiaojuan, born in 1994, M. S. candidate. Her research interests include data mining and data stream ensemble classification.
    ZHANG Ni, born in 1996, M. S. candidate. Her research interests include data mining and high utility pattern mining.
    CHENG Haodong, born in 1996, M. S. candidate. His research interests include data mining and high utility pattern mining.
  • Supported by:
    National Natural Science Foundation of China (62062004); Ningxia Natural Science Foundation (2020AAC03216)

基于动态加权函数的集成分类算法

王乐, 韩萌, 李小娟, 张妮, 程浩东

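  1. School of Computer Science and Engineering, North Minzu University, Yinchuan 750021
  • Corresponding author: HAN Meng
  • About the authors: WANG Le (born 1994), female, from Baicheng, Jilin; M. S. candidate and CCF member. Research interests: data mining and data stream ensemble classification.
    LI Xiaojuan (born 1994), female (Hui ethnicity), from Wuzhong, Ningxia; M. S. candidate and CCF member. Research interests: data mining and data stream ensemble classification.
    ZHANG Ni (born 1996), female, from Changzhi, Shanxi; M. S. candidate and CCF member. Research interests: data mining and high utility pattern mining.
    CHENG Haodong (born 1996), male, from Tai'an, Shandong; M. S. candidate and CCF member. Research interests: data mining and high utility pattern mining.
  • Funding:
    National Natural Science Foundation of China (62062004); Ningxia Natural Science Foundation (2020AAC03216)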

Abstract:

In data stream ensemble classification, classifiers must adapt to a constantly changing stream, and the weights of the base classifiers must be adjusted so that an appropriate set of classifiers is selected. To address this, an ensemble classification algorithm based on a dynamic weighting function was proposed. First, a new weighting function was proposed to adjust the weights of the base classifiers, and the classifiers were trained with constantly updated data blocks. Then, a weight function was used to make a reasonable selection among the candidate classifiers. Finally, the incremental nature of decision trees was applied to the base classifiers to classify the data stream. Extensive experiments show that the performance of the proposed algorithm is not affected by block size. Compared with the AUE2 algorithm, the average number of leaves is reduced by 681.3, the average number of nodes by 1 192.8, and the average tree depth by 4.42, while accuracy is relatively improved and running time is reduced. The experimental results show that, when classifying data streams, the algorithm not only maintains accuracy but also saves considerable memory and time.

Key words: data stream, ensemble classification, dynamic weighting, block, incremental learning
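
The abstract does not state the weighting function itself, so the following is only an illustrative sketch of block-based weighting and candidate selection, not the authors' method. As a stand-in it uses the weight commonly attributed to the AUE2 baseline, w = 1 / (MSE_r + MSE_i + ε), where MSE_r is the error of a random classifier on the newest block and MSE_i is the error of classifier i on that block. All function names, the `predict_proba` callable interface, and the value of `EPS` are assumptions made for this example.

```python
from collections import Counter

EPS = 1e-9  # assumed small constant that avoids division by zero


def mse_random(labels):
    """Expected squared error of a classifier that guesses by class prior."""
    n = len(labels)
    priors = [count / n for count in Counter(labels).values()]
    return sum(p * (1.0 - p) ** 2 for p in priors)


def mse_classifier(predict_proba, block):
    """Mean squared error of one classifier on a labelled block.

    predict_proba(x) is assumed to return a dict: class label -> probability.
    block is a list of (x, y) pairs.
    """
    errors = [(1.0 - predict_proba(x).get(y, 0.0)) ** 2 for x, y in block]
    return sum(errors) / len(errors)


def block_weight(predict_proba, block):
    """AUE2-style stand-in weight: higher when the classifier beats chance."""
    labels = [y for _, y in block]
    return 1.0 / (mse_random(labels) + mse_classifier(predict_proba, block) + EPS)


def select_top_k(weighted, k):
    """Keep the k highest-weighted (weight, classifier) pairs."""
    return sorted(weighted, key=lambda pair: pair[0], reverse=True)[:k]


if __name__ == "__main__":
    # Toy block with two balanced classes (hypothetical data).
    block = [((0,), "a"), ((1,), "b"), ((2,), "a"), ((3,), "b")]
    truth = dict(block)

    def perfect(x):
        return {truth[x]: 1.0}  # always assigns full probability to the true class

    def always_a(x):
        return {"a": 1.0}  # ignores the input entirely

    candidates = [(block_weight(f, block), f) for f in (always_a, perfect)]
    best_weight, best = select_top_k(candidates, k=1)[0]
    print(best.__name__, best_weight)
```

In a full block-based ensemble, each new block would be used to recompute every member's weight, train one candidate classifier, keep the top-k members, and incrementally update the survivors; those steps depend on the base learner and are omitted here.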

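Abstract:

To address the problem of how, in data stream ensemble classification, to make classifiers adapt to a constantly changing data stream and to adjust the weights of the base classifiers so that an appropriate set of classifiers is selected, an ensemble classification algorithm based on a dynamic weighting function is proposed. First, a weighting function is proposed to adjust the weights of the base classifiers, and the classifiers are trained with constantly updated data blocks. Then, a new weight function is used to make a reasonable selection among the candidate classifiers. Finally, the incremental nature of decision trees is applied in the base classifiers to classify the data stream. Extensive experiments show that the performance of the ensemble classification algorithm based on a dynamic weighting function is not affected by block size. Compared with the AUE2 algorithm, the number of leaves is reduced by 681.3 on average, the number of nodes by 1 192.8 on average, and the tree depth by 4.42 on average, while accuracy is relatively improved and time consumption is reduced. The experimental results show that, when classifying data streams, the algorithm not only maintains accuracy but also saves a large amount of memory and time.

Key words: data stream, ensemble classification, dynamic weighting, block, incremental learning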

CLC Number: