Naive Bayesian classification algorithm based on attribute clustering under different classification

doi:10.3724/SP.J.1087.2011.03072

Journal of Computer Applications, 2011, Vol. 31, Issue 11: 3072-3074. DOI: 10.3724/SP.J.1087.2011.03072

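PENG Xing-yuan, LIU Qiong-sun

College of Mathematics and Statistics, Chongqing University, Chongqing 401331, China

Received: 2011-05-10 Revised: 2011-07-06 Online: 2011-11-16 Published: 2011-11-01
Corresponding author: PENG Xing-yuan
About the authors: PENG Xing-yuan (born 1985), female, from Suining, Sichuan, master's student; research interests: data analysis, statistical decision-making. LIU Qiong-sun (born 1956), female, from Chongqing, professor; research interests: intelligent computing, data mining, applied statistics.
Supported by:
Fundamental Research Funds for the Central Universities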

Abstract

Although the Naive Bayesian (NB) classification algorithm is simple and effective, its conditional independence assumption ignores the correlations that exist among attribute variables. Considering the influence of this assumption on classification performance, a new grouping technique that clusters the conditional attributes is proposed. It not only avoids the deficiency of the traditional NB algorithm, which assumes that all conditional attributes are independent, but also reflects the different degrees of dependence among conditional attributes under different classes. Simulation experiments on several UCI data sets demonstrate the effectiveness of the new algorithm.

Key words: Naive Bayesian (NB), attribute correlation intensity, clustering algorithm, chi-square statistic
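
The abstract does not spell out the grouping procedure, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' algorithm. It assumes discrete attributes, measures pairwise association within each class by the Pearson chi-square statistic, merges attributes whose statistic exceeds a threshold (a simple single-linkage rule chosen here for illustration), and then scores each class by the product of the joint probabilities of its attribute groups. The names `chi2_stat`, `group_attributes`, `GroupedNB`, the default threshold 3.84 (the 0.05 critical value for one degree of freedom, suitable only for binary attributes), and the toy data are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations
import math


def chi2_stat(xs, ys):
    """Pearson chi-square statistic for two discrete sequences of equal length."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    cx, cy = Counter(xs), Counter(ys)
    stat = 0.0
    for a in cx:
        for b in cy:
            expected = cx[a] * cy[b] / n
            stat += (joint[(a, b)] - expected) ** 2 / expected
    return stat


def group_attributes(rows, threshold):
    """Merge attributes whose pairwise chi-square exceeds `threshold` (assumed rule)."""
    d = len(rows[0])
    parent = list(range(d))

    def find(i):
        while parent[i] != i:
            i = parent[i]
        return i

    for i, j in combinations(range(d), 2):
        if chi2_stat([r[i] for r in rows], [r[j] for r in rows]) > threshold:
            parent[find(i)] = find(j)
    groups = defaultdict(list)
    for i in range(d):
        groups[find(i)].append(i)
    return sorted(groups.values())


class GroupedNB:
    """Naive Bayes over per-class attribute groups (illustrative sketch only)."""

    def fit(self, X, y, threshold=3.84):
        self.n = len(y)
        self.class_count = Counter(y)
        # Number of distinct values per attribute, used for Laplace smoothing.
        self.n_values = [len({r[i] for r in X}) for i in range(len(X[0]))]
        self.groups, self.tables = {}, {}
        for c in self.class_count:
            rows = [r for r, label in zip(X, y) if label == c]
            # Attributes are grouped separately for each class.
            self.groups[c] = group_attributes(rows, threshold)
            self.tables[c] = [Counter(tuple(r[i] for i in g) for r in rows)
                              for g in self.groups[c]]
        return self

    def predict(self, x):
        best, best_score = None, -math.inf
        for c, n_c in self.class_count.items():
            score = math.log(n_c / self.n)
            for g, table in zip(self.groups[c], self.tables[c]):
                cells = math.prod(self.n_values[i] for i in g)
                key = tuple(x[i] for i in g)
                score += math.log((table[key] + 1) / (n_c + cells))
            if score > best_score:
                best, best_score = c, score
        return best


# Toy data: in class "a" attributes 0 and 1 always agree; in class "b"
# attributes 1 and 2 always agree. All single-attribute marginals are equal.
rows_a = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)] * 5
rows_b = [(0, 0, 0), (1, 0, 0), (0, 1, 1), (1, 1, 1)] * 5
X = rows_a + rows_b
y = ["a"] * len(rows_a) + ["b"] * len(rows_b)
model = GroupedNB().fit(X, y)
```

In this toy data every single attribute has the same marginal distribution in both classes, so a plain NB classifier has nothing to distinguish them; the per-class groups are what carry the signal. A real data set would need a threshold matched to each attribute pair's degrees of freedom and a treatment of continuous attributes, neither of which is covered here.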

CLC number: TP18

PENG Xing-yuan, LIU Qiong-sun. Naive Bayesian classification algorithm based on attribute clustering under different classification[J]. Journal of Computer Applications, 2011, 31(11): 3072-3074.

