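Research on Linear Kernel Support Vector Machine Based on Dual Random Projection

Xi Xi (席茜)1, Zhang Fengqin (张凤琴)2, Li Xiaoqing (李小青)1, Guan Hua (管桦)1, Chen Guirong (陈桂茸)1, Wang Mengfei (王梦非)1

  1. School of Information and Navigation, Air Force Engineering University
  2. School of Telecommunication Engineering, Air Force Engineering University
  • Received: 2016-11-10 Revised: 2016-12-29 Online: 2016-12-29
  • Corresponding author: Xi Xi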

Abstract: The classification accuracy of a large-scale support vector machine (SVM) declines after feature dimensionality reduction by random projection. To address this problem, a linear kernel SVM based on dual random projection (drp-LSVM) is proposed for large-scale classification problems by incorporating dual recovery theory. The relevant geometric properties of drp-LSVM are first analyzed and proved: it retains geometric advantages similar to those of the linear kernel SVM based on random projection dimensionality reduction (rp-LSVM), while its separating hyperplane is closer to that of the original classifier trained on the full data. Then, to solve drp-LSVM quickly, the traditional SMO algorithm is improved, and a drp-LSVM classifier based on the improved SMO algorithm is designed. Finally, experiments show that drp-LSVM inherits the advantages of rp-LSVM while reducing classification error and improving training accuracy, and that its performance metrics are closer to those of the classifier trained on the original data. The classifier based on the improved SMO algorithm not only reduces memory consumption but also achieves good training accuracy.

Key words: machine learning, support vector machine (SVM), random projection, sequential minimal optimization (SMO), dimensionality reduction
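
The difference between rp-LSVM and drp-LSVM described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not stated in the abstract: the projection matrix is Gaussian, the SVM has no bias term, and the dual recovery step takes the form w = Xᵀ(α ∘ y), where α is the dual solution obtained on the projected data. The solver below is a plain dual coordinate descent used as a stand-in; it is not the improved SMO algorithm of the paper, and the function names `dual_coordinate_descent` and `rp_and_drp_svm` are hypothetical.

```python
import numpy as np


def dual_coordinate_descent(X, y, C=1.0, epochs=50, seed=0):
    """Solve the L1-loss linear SVM dual (no bias term) by coordinate descent.

    Stand-in solver for illustration only; NOT the paper's improved SMO.
    Returns the dual variables alpha and the primal vector w = X^T (alpha * y).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    alpha = np.zeros(n)
    w = np.zeros(d)
    q = np.einsum("ij,ij->i", X, X)  # squared norms = diagonal of the Gram matrix
    for _ in range(epochs):
        for i in rng.permutation(n):
            if q[i] == 0.0:
                continue  # an all-zero sample cannot change the objective
            grad = y[i] * (X[i] @ w) - 1.0  # partial derivative of the dual
            new_alpha = min(max(alpha[i] - grad / q[i], 0.0), C)  # clip to [0, C]
            w += (new_alpha - alpha[i]) * y[i] * X[i]  # keep w in sync with alpha
            alpha[i] = new_alpha
    return alpha, w


def rp_and_drp_svm(X, y, m, C=1.0, seed=0):
    """Train on randomly projected data, then recover a full-dimension classifier.

    Assumptions (not from the abstract): Gaussian projection matrix R and
    recovery rule w_drp = X^T (alpha * y).
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    R = rng.standard_normal((d, m)) / np.sqrt(m)  # d x m random projection
    Z = X @ R                                     # reduced data, n x m
    alpha, w_rp = dual_coordinate_descent(Z, y, C=C, seed=seed)
    # rp-style classifier: lives in R^m, predicts with sign((x @ R) @ w_rp).
    # drp-style classifier: reuse the dual solution with the ORIGINAL features.
    w_drp = X.T @ (alpha * y)                     # lives in R^d
    return R, w_rp, w_drp, alpha
```

In this sketch only the reduced matrix `Z` is needed while solving the dual; the original features are touched once, in the recovery step, to obtain a hyperplane in the original space.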

CLC Number: