Journal of Computer Applications ›› 2011, Vol. 31 ›› Issue (04): 1114-1116. DOI: 10.3724/SP.J.1087.2011.01114

• Artificial intelligence

Classification method for SVDD based on information entropy

Wei-cheng HE, Jing-long FANG

  1. School of Computer Science, Hangzhou Dianzi University, Hangzhou Zhejiang 310018, China
  • Received: 2010-09-29 Revised: 2010-11-25 Online: 2011-04-08 Published: 2011-04-01
  • Contact: Wei-cheng HE

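Support vector data description classification based on information entropy

Wei-cheng HE, Jing-long FANG

  1. School of Computer Science, Hangzhou Dianzi University, Hangzhou 310018, China
  • Corresponding author: Wei-cheng HE
  • About the authors: Wei-cheng HE (1986-), male, born in Jinhua, Zhejiang, master's degree candidate; main research interests: pattern recognition, support vector machines;
    Jing-long FANG (1964-), male, born in Jingdezhen, Jiangxi, research fellow; main research interests: pattern recognition, support vector machines.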

Abstract: Most Support Vector Data Description (SVDD) methods suffer from blindness and bias when applied to two-class problems. The authors propose a new SVDD method based on information entropy. First, the entropy of each of the two classes of samples is computed. Second, the entropy values are compared to decide which class is placed inside the ball. Finally, the penalty is set from the sizes of the two sample sets and their entropy values. Experiments on artificial data and UCI datasets, including imbalanced classification problems, show the feasibility and effectiveness of the proposed method.

Key words: information entropy, distribution characteristics, Support Vector Data Description (SVDD), classification
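
For orientation, the "ball" and "penalty" in the abstract can be read against the commonly used SVDD formulation with negative examples (Tax and Duin). That this is the baseline the authors modify is an assumption, since the abstract does not state it. The symbols below (center a, radius R, slack variables ξ, penalties C_1 and C_2) follow that standard formulation and are not taken from the paper. Samples x_i belong to the class placed inside the ball and samples x_l to the class kept outside:

```latex
\begin{aligned}
\min_{R,\,a,\,\xi}\quad & R^2 + C_1 \sum_{i} \xi_i + C_2 \sum_{l} \xi_l \\
\text{s.t.}\quad & \lVert x_i - a \rVert^2 \le R^2 + \xi_i, \qquad \xi_i \ge 0, \\
& \lVert x_l - a \rVert^2 \ge R^2 - \xi_l, \qquad \xi_l \ge 0 .
\end{aligned}
```

Under this reading, choosing which class plays the role of x_i and how C_1 and C_2 are set are the two places where the entropy values and class sizes would enter.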

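Abstract: Existing Support Vector Data Description (SVDD) methods are usually blind and biased when solving classification problems. Building on a study of information entropy and SVDD classification theory, the E-SVDD algorithm is proposed to improve two-class classification. First, the entropy of each of the two classes of sample data is computed; next, the entropy values determine which class is placed inside the ball; finally, the C value in the SVDD algorithm is redefined by combining the sizes of the two classes with the distribution information provided by their respective entropy values. Experiments on artificial sample sets and UCI datasets verify the feasibility and effectiveness of the algorithm.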

Key words: information entropy, distribution characteristics, support vector data description, classification
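
The first two steps described in the abstract (compute an entropy value per class, then use the values to pick the class placed inside the ball) might be sketched as follows. The abstract does not say how entropy is estimated or whether the lower- or higher-entropy class goes inside, so the histogram-based estimator, the shared bin edges, the function names `class_entropy` and `choose_target_class`, and the `prefer` parameter are all assumptions made for illustration, not the authors' method.

```python
import numpy as np


def class_entropy(samples, edges):
    """Mean per-feature Shannon entropy (bits) from histogram counts.

    Assumption: entropy is estimated feature by feature on fixed bin edges.
    """
    samples = np.asarray(samples, dtype=float)
    entropies = []
    for column, col_edges in zip(samples.T, edges):
        counts, _ = np.histogram(column, bins=col_edges)
        probs = counts[counts > 0] / counts.sum()
        entropies.append(-np.sum(probs * np.log2(probs)))
    return float(np.mean(entropies))


def choose_target_class(class_a, class_b, bins=10, prefer="lower"):
    """Return which class ("a" or "b") to enclose, plus both entropies.

    Assumption: `prefer` selects the lower- or higher-entropy class; the
    abstract only says the choice depends on the size of the values.
    """
    class_a = np.asarray(class_a, dtype=float)
    class_b = np.asarray(class_b, dtype=float)
    pooled = np.vstack([class_a, class_b])
    # Shared edges per feature so both classes are measured on the same grid.
    edges = [np.histogram_bin_edges(pooled[:, j], bins=bins)
             for j in range(pooled.shape[1])]
    h_a = class_entropy(class_a, edges)
    h_b = class_entropy(class_b, edges)
    if prefer == "lower":
        target = "a" if h_a <= h_b else "b"
    else:
        target = "a" if h_a >= h_b else "b"
    return target, h_a, h_b


# Toy data: class a is a single repeated point, class b is spread evenly.
tight = np.zeros((50, 2))
spread = np.column_stack([np.arange(100) % 10, np.arange(100) % 10])
target, h_tight, h_spread = choose_target_class(tight, spread)
```

Using bin edges computed from the pooled data keeps the two entropy estimates comparable; with per-class edges, a tight cluster and a wide one of the same shape would score the same.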
