Journal of Computer Applications ›› 2011, Vol. 31 ›› Issue (04): 1114-1116. DOI: 10.3724/SP.J.1087.2011.01114

• Artificial Intelligence •

Classification method for SVDD based on information entropy

Wei-cheng HE (何伟成), Jing-long FANG (方景龙)

  1. School of Computer Science, Hangzhou Dianzi University, Hangzhou Zhejiang 310018, China
  • Received: 2010-09-29 Revised: 2010-11-25 Online: 2011-04-08 Published: 2011-04-01
  • Contact: Wei-cheng HE
  • About the authors: Wei-cheng HE (born 1986), male, from Jinhua, Zhejiang, master's student; research interests: pattern recognition, support vector machines.
    Jing-long FANG (born 1964), male, from Jingdezhen, Jiangxi, research fellow; research interests: pattern recognition, support vector machines.

Abstract: Existing Support Vector Data Description (SVDD) methods often suffer from blindness and bias when applied to classification problems. Building on information entropy and SVDD classification theory, the authors propose E-SVDD, an improved algorithm for two-class problems. First, the entropy of each of the two classes of samples is computed. Second, the entropy values determine which class is placed inside the ball. Finally, the penalty parameter C of SVDD is redefined using the sizes of the two classes together with the distribution information provided by their entropy values. Experiments on artificial data sets and UCI data sets, including imbalanced classification problems, show the feasibility and effectiveness of the proposed method.

Key words: information entropy, distribution characteristics, Support Vector Data Description (SVDD), classification
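
As a rough illustration of the first step described in the abstract (computing an entropy value for each of the two classes), the sketch below estimates a per-class Shannon entropy from histograms. The abstract does not say how the entropy is estimated, so the histogram estimator, the shared binning over the pooled data, the averaging over features, and the function names `shannon_entropy` and `two_class_entropies` are all assumptions made here for illustration; this is not the authors' E-SVDD implementation.

```python
import numpy as np


def shannon_entropy(counts):
    """Shannon entropy (in nats) of the discrete distribution given by counts."""
    counts = np.asarray(counts, dtype=float)
    total = counts.sum()
    if total == 0:
        return 0.0
    p = counts[counts > 0] / total  # drop empty bins, since 0 * log(0) is taken as 0
    return float(-(p * np.log(p)).sum())


def two_class_entropies(X_a, X_b, bins=10):
    """Histogram-based entropy estimate for each of two classes.

    Assumption (not from the paper): each feature is binned over the range
    of the pooled data, so both classes are measured on the same grid, and
    the per-feature entropies are averaged into one value per class.
    X_a and X_b are arrays of shape (n_samples, n_features).
    """
    X_a = np.asarray(X_a, dtype=float)
    X_b = np.asarray(X_b, dtype=float)
    pooled = np.vstack([X_a, X_b])
    lo, hi = pooled.min(axis=0), pooled.max(axis=0)

    h_a, h_b = [], []
    for j in range(pooled.shape[1]):
        value_range = (lo[j], hi[j])
        counts_a, _ = np.histogram(X_a[:, j], bins=bins, range=value_range)
        counts_b, _ = np.histogram(X_b[:, j], bins=bins, range=value_range)
        h_a.append(shannon_entropy(counts_a))
        h_b.append(shannon_entropy(counts_b))
    return float(np.mean(h_a)), float(np.mean(h_b))


# Toy data: class A is concentrated at one point, class B is spread out.
X_a = np.full((20, 2), 0.5)
X_b = np.column_stack([np.linspace(0.0, 1.0, 20), np.linspace(0.0, 1.0, 20)])
h_a, h_b = two_class_entropies(X_a, X_b)
```

On this toy data the concentrated class gets the lower entropy estimate. How the entropy values decide which class goes inside the ball, and how they enter the redefined C, is not given in the abstract, so the sketch stops at the entropy computation.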
