Journal of Computer Applications ›› 2013, Vol. 33 ›› Issue (09): 2553-2556. DOI: 10.11772/j.issn.1001-9081.2013.09.2553

• Artificial Intelligence •

Optimal hyperplane modification of support vector machine based on Fisher within-class scatter

YANG Ting1, MENG Xiangru1, WEN Xiangxi1,2, WU Wen1

  1. College of Information and Navigation, Air Force Engineering University, Xi'an Shaanxi 710077, China;
  2. College of Air Traffic Control and Navigation, Air Force Engineering University, Xi'an Shaanxi 710051, China
  • Received: 2013-03-21 Revised: 2013-04-26 Online: 2013-10-18 Published: 2013-09-01
  • Contact: YANG Ting
  • About the authors: YANG Ting (born 1989), female, from Yingcheng, Hubei, master's student; main research interest: network fault diagnosis;
    MENG Xiangru (born 1963), male, from Xi'an, Shaanxi, professor and doctoral supervisor; main research interest: broadband communication networks;
    WEN Xiangxi (born 1984), male, from Lianyungang, Jiangsu, Ph.D.; main research interest: network fault diagnosis;
    WU Wen (born 1985), female, from Xi'an, Shaanxi, Ph.D. candidate; main research interest: network survivability.
  • Supported by:

    The National Natural Science Foundation of China

Abstract: When a Support Vector Machine (SVM) is trained on imbalanced data, the optimal hyperplane shifts and the generalization ability of the classification model declines. To address this problem, a hyperplane modification method based on the average distribution ratio derived from the Fisher within-class scatter is proposed. First, the normal vector of the hyperplane is obtained by SVM training. Next, the Fisher within-class scatter of each class along the direction of this normal vector is computed to evaluate how the two classes are distributed. Finally, the position of the optimal hyperplane is corrected according to the average distribution ratio, which is obtained from the within-class scatter together with the number of samples in each class. Experimental results on benchmark datasets show that the method improves the SVM's recognition rate for the minority class on imbalanced datasets, which helps improve the generalization ability of the model.

Key words: Support Vector Machine (SVM), imbalanced data, modification, Fisher within-class scatter
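
The procedure outlined in the abstract (project both classes onto the hyperplane normal, measure each class's within-class scatter, then reposition the hyperplane) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the formula for the average distribution ratio, so the sketch assumes it is the ratio of the per-sample root-mean-square spreads of the two projected classes, and it assumes the corrected threshold divides the segment between the projected class means in that ratio. The function names `project`, `within_class_scatter`, and `corrected_bias` are hypothetical, and the normal vector `w` is assumed to come from an already trained linear SVM.

```python
from math import sqrt


def project(w, x):
    """Project sample x onto the hyperplane normal w (dot product)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def within_class_scatter(projections):
    """Within-class scatter of 1-D projections:
    sum of squared deviations from the class mean."""
    mean = sum(projections) / len(projections)
    return sum((p - mean) ** 2 for p in projections)


def corrected_bias(w, pos, neg):
    """Return a bias b for f(x) = w.x + b.

    ASSUMPTION (not stated in the abstract): the threshold is placed
    between the projected class means, closer to the class whose
    average spread (scatter divided by sample count, square-rooted)
    is smaller.
    """
    p_pos = [project(w, x) for x in pos]
    p_neg = [project(w, x) for x in neg]
    m_pos = sum(p_pos) / len(p_pos)
    m_neg = sum(p_neg) / len(p_neg)
    # Average spread per sample, so class size is taken into account.
    d_pos = sqrt(within_class_scatter(p_pos) / len(p_pos))
    d_neg = sqrt(within_class_scatter(p_neg) / len(p_neg))
    total = d_pos + d_neg
    if total == 0.0:
        # Both classes collapse to a point: fall back to the midpoint.
        threshold = (m_pos + m_neg) / 2.0
    else:
        threshold = m_neg + (m_pos - m_neg) * d_neg / total
    return -threshold


# Toy example: a tight minority class (2 samples) and a wider majority class.
w = (1.0, 0.0)
minority = [(5.0, 0.0), (7.0, 0.0)]
majority = [(-2.0, 0.0), (2.0, 0.0), (-2.0, 1.0), (2.0, 1.0)]
print(corrected_bias(w, minority, majority))  # -4.0
```

In the toy example the projected class means are 6 and 0 and the assumed average spreads are 1 and 2, so the threshold lands at 4 rather than at the midpoint 3, giving the wider majority class more room while still separating both classes.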
