Anti-spam model based on semi-Naive Bayesian classification model

Journal of Computer Applications

Bei Hui, Yue Wu

School of Computer Science and Engineering, University of Electronic Science and Technology of China

Received: 2008-09-24 Revised: 1900-01-01 Online: 2009-03-01 Published: 2009-03-01
Corresponding author: Bei Hui

Abstract

The Naive Bayes (NB) classification model is simple and efficient, and it performs well in spam filtering. Its conditional independence assumption, however, severs the relationships between attributes and reduces classification accuracy. This paper relaxes that assumption and presents a new spam classification model based on a semi-Naive Bayes classifier: an N-averaged one-dependence mail filtering model. In each one-dependence classifier, all attributes are assumed to depend on a single attribute (1-dependence), and the average of the probabilities produced by the N one-dependence classifiers is used as the predicted probability of each class label. Experiments show that the model remains simple and efficient while lowering the error rate of spam classification.

Key words: Bayesian classification, semi-Naive Bayes, spam
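
The averaging idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's estimator, feature set, or smoothing, so everything below is an assumption made for illustration: binary word-presence features, Laplace smoothing with a hypothetical `alpha` parameter, and a hypothetical class name `AveragedOneDependence`. For each attribute i taken in turn as the single parent, the sketch computes P(y, x_i) multiplied by the product of P(x_j | y, x_i) over the other attributes, then averages the N results.

```python
from collections import Counter


class AveragedOneDependence:
    """Illustrative averaged one-dependence classifier for binary features."""

    def __init__(self, n_features, classes=(0, 1), alpha=1.0):
        self.n = n_features
        self.classes = classes
        self.alpha = alpha        # Laplace smoothing constant (assumption)
        self.total = 0            # number of training messages
        self.pair = Counter()     # counts of (y, i, x_i)
        self.triple = Counter()   # counts of (y, i, x_i, j, x_j)

    def fit(self, X, y):
        for row, label in zip(X, y):
            self.total += 1
            for i, xi in enumerate(row):
                self.pair[(label, i, xi)] += 1
                for j, xj in enumerate(row):
                    self.triple[(label, i, xi, j, xj)] += 1
        return self

    def _score(self, row, label):
        # Average, over every parent attribute i, of
        # P(y, x_i) * prod_{j != i} P(x_j | y, x_i).
        k = len(self.classes)
        acc = 0.0
        for i, xi in enumerate(row):
            c_pair = self.pair[(label, i, xi)]
            # Smoothed joint P(y, x_i); each feature has 2 possible values.
            p = (c_pair + self.alpha) / (self.total + self.alpha * k * 2)
            for j, xj in enumerate(row):
                if j != i:
                    c_tri = self.triple[(label, i, xi, j, xj)]
                    p *= (c_tri + self.alpha) / (c_pair + self.alpha * 2)
            acc += p
        return acc / self.n

    def predict_proba(self, row):
        scores = {c: self._score(row, c) for c in self.classes}
        z = sum(scores.values())
        return {c: s / z for c, s in scores.items()}

    def predict(self, row):
        proba = self.predict_proba(row)
        return max(proba, key=proba.get)


# Toy data (made up): columns mark presence of "free", "winner", "meeting";
# label 1 = spam, label 0 = legitimate mail.
X = [(1, 1, 0), (1, 0, 0), (1, 1, 0), (0, 0, 1), (0, 0, 1), (1, 0, 1)]
y = [1, 1, 1, 0, 0, 0]
model = AveragedOneDependence(n_features=3).fit(X, y)
print(model.predict_proba((1, 1, 0)))
```

Training stores pairwise co-occurrence counts, so the cost grows with the square of the number of attributes. A practical filter would likely restrict the vocabulary or the set of parent attributes, a choice the abstract does not specify.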

Bei Hui, Yue Wu. Anti-spam model based on semi-Naive Bayesian classification model[J]. Journal of Computer Applications.
