Journal of Computer Applications ›› 2021, Vol. 41 ›› Issue (12): 3475-3479. DOI: 10.11772/j.issn.1001-9081.2021060898

• The 18th China Conference on Machine Learning (CCML 2021)

BNSL-FIM: Bayesian network structure learning algorithm based on frequent item mining

Xuanyi LI (李昡熠)1,2, Yun ZHOU (周鋆)1,2

  1. College of Systems Engineering, National University of Defense Technology, Changsha, Hunan 410073, China
  2. Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha, Hunan 410073, China
  • Received: 2021-05-12 Revised: 2021-06-08 Accepted: 2021-06-23 Online: 2021-08-20 Published: 2021-12-10
  • Contact: Yun ZHOU
  • About author: LI Xuanyi, born in 1993, from Wuyuan, Jiangxi, M.S. candidate. Her research interests include machine learning and Bayesian networks.
  • Supported by:
    the National Natural Science Foundation of China (61703416); the Training Program for Excellent Young Innovators of Changsha (KQ2009009)

Abstract:

Bayesian networks can represent uncertain knowledge and support inference over it. However, because real sample data are noisy and limited in size, and because searching the space of network structures is complex, Bayesian network structure learning always carries some error. To improve its accuracy, BNSL-FIM (Bayesian Network Structure Learning algorithm based on Frequent Item Mining) is proposed, which uses the maximum frequent itemset and association rule analysis results as prior knowledge. First, the maximum frequent itemset is mined from the data and structure learning is performed on it; the association rule analysis results are then used to correct the learned structure, which determines the prior knowledge. Second, a Bayesian Dirichlet equivalent uniform (BDeu) scoring algorithm that incorporates this prior knowledge is proposed for Bayesian network structure learning. Finally, experiments were carried out on 6 public standard datasets, comparing the Hamming distance to the original network structure for structures learned with and without the prior. The results show that, compared with the BDeu scoring algorithm without the prior, the proposed algorithm significantly improves the accuracy of Bayesian network structure learning.

Key words: Bayesian network, structure learning, association rule analysis, Apriori algorithm, Bayesian Dirichlet equivalent uniform (BDeu) score
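
The building blocks named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that "maximum frequent itemset" means maximal frequent itemsets (frequent itemsets with no frequent proper superset), uses the standard BDeu local score without the paper's prior-knowledge term (which the abstract does not specify), and counts a reversed edge as two differences in the Hamming distance. All function and variable names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search; returns {itemset: support}."""
    rows = [frozenset(t) for t in transactions]
    n = len(rows)
    candidates = [frozenset([i]) for i in sorted({i for t in rows for i in t})]
    frequent = {}
    k = 1
    while candidates:
        # Keep the candidates whose relative support reaches the threshold.
        level = {}
        for cand in candidates:
            support = sum(1 for t in rows if cand <= t) / n
            if support >= min_support:
                level[cand] = support
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates, then prune any
        # candidate that has an infrequent k-subset.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = [c for c in joined
                      if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent


def maximal_itemsets(frequent):
    """Assumed reading of 'maximum': no frequent proper superset exists."""
    return [s for s in frequent if not any(s < other for other in frequent)]


def bdeu_local_score(data, child, parents, arity, ess=1.0):
    """Standard log BDeu score of one node given a candidate parent set.

    data: list of dicts mapping variable name -> state index
    arity: dict mapping variable name -> number of states
    ess: equivalent sample size
    """
    r = arity[child]
    q = math.prod(arity[p] for p in parents)  # number of parent configurations
    n_ij, n_ijk = Counter(), Counter()
    for row in data:
        j = tuple(row[p] for p in parents)
        n_ij[j] += 1
        n_ijk[(j, row[child])] += 1
    a_ij, a_ijk = ess / q, ess / (q * r)
    # Configurations with zero counts contribute exactly 0, so only the
    # observed ones need to be summed.
    score = sum(math.lgamma(a_ij) - math.lgamma(a_ij + n) for n in n_ij.values())
    score += sum(math.lgamma(a_ijk + n) - math.lgamma(a_ijk) for n in n_ijk.values())
    return score


def hamming_distance(edges_a, edges_b):
    """Number of directed edges present in exactly one of the two graphs."""
    return len(set(edges_a) ^ set(edges_b))


if __name__ == "__main__":
    baskets = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B", "C"}, {"A", "B", "C"}]
    print(maximal_itemsets(frequent_itemsets(baskets, min_support=0.5)))
    print(hamming_distance([("A", "B"), ("B", "C")], [("A", "B"), ("C", "B")]))
```

Because the BDeu score decomposes over nodes, a network's total score is the sum of `bdeu_local_score` over all nodes. A prior-aware variant along the lines described in the abstract would presumably adjust that sum with a structure-prior term, but its exact form is not given here.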

CLC number: