Journal of Computer Applications ›› 2017, Vol. 37 ›› Issue (6): 1680-1685. DOI: 10.11772/j.issn.1001-9081.2017.06.1680


Linear kernel support vector machine based on dual random projection

XI Xi, ZHANG Fengqin, LI Xiaoqing, GUAN Hua, CHEN Guirong, WANG Mengfei   

  1. Information and Navigation College, Air Force Engineering University, Xi'an Shaanxi 710077, China
  • Received:2016-11-10 Revised:2016-12-29 Online:2017-06-10 Published:2017-06-14
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (71503260) and the Natural Science Foundation of Shaanxi Province (2014JM8345).


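  • Corresponding author: XI Xi
  • About the authors: XI Xi (席茜, born 1993), female, from Xinjiang County, Shanxi, M.S. candidate, CCF member; research interests: data mining, machine learning. ZHANG Fengqin (张凤琴, born 1964), female, from Ruicheng, Shanxi, associate professor, M.S., CCF member; research interests: data mining, complex networks, distributed databases. LI Xiaoqing (李小青, born 1982), female, from Jingyang, Shaanxi, lecturer, Ph.D.; research interests: intelligent data processing. GUAN Hua (管桦, born 1963), male, from Xiaogan, Hubei, professor, M.S.; research interests: command automation. CHEN Guirong (陈桂茸, born 1970), female, from Heyang, Shaanxi, lecturer, M.S.; research interests: complex networks. WANG Mengfei (王梦非, born 1992), male, from Jinan, Shandong, M.S. candidate; research interests: complex networks, machine learning.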

Abstract: To address the loss of classification accuracy that large-scale Support Vector Machines (SVM) suffer after feature dimensionality reduction by random projection, a linear kernel SVM based on dual random projection (drp-LSVM) is proposed for large-scale classification problems, drawing on dual recovery theory. First, the relevant geometric properties of drp-LSVM are analyzed and proved: while retaining geometric advantages similar to those of the linear kernel SVM based on random projection (rp-LSVM), the separating hyperplane of drp-LSVM is closer to that of the original classifier trained on the full data. Then, to solve drp-LSVM quickly, the traditional Sequential Minimal Optimization (SMO) algorithm is improved, and a drp-LSVM classifier based on the improved SMO algorithm is designed. Finally, experimental results show that drp-LSVM inherits the advantages of rp-LSVM while reducing classification error and improving training accuracy, and that its performance measures are closer to those of the classifier trained on the original data; the classifier based on the improved SMO algorithm both reduces memory consumption and achieves relatively high training accuracy.

Key words: machine learning, Support Vector Machine (SVM), random projection, Sequential Minimal Optimization (SMO) algorithm, dimensionality reduction
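
The difference between rp-LSVM and drp-LSVM described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian projection matrix, the bias-free L1-loss formulation, and the function names `dual_cd_svm` and `rp_and_drp_weights` are assumptions made for illustration, and a plain dual coordinate descent solver stands in for the paper's improved SMO algorithm. Both classifiers share the dual variables learned on the projected data; drp-LSVM then maps them back through the original features.

```python
import numpy as np


def dual_cd_svm(X, y, C=1.0, epochs=50, seed=0):
    """Dual coordinate descent for a bias-free L1-loss linear SVM.

    Stand-in solver (not the paper's improved SMO). Returns the dual
    variables alpha, one per training sample, with 0 <= alpha_i <= C.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    alpha = np.zeros(n)
    w = np.zeros(d)  # kept equal to sum_i alpha_i * y_i * x_i
    sq_norms = np.einsum("ij,ij->i", X, X)
    for _ in range(epochs):
        for i in rng.permutation(n):
            if sq_norms[i] == 0.0:
                continue  # an all-zero sample cannot change w
            grad = y[i] * (X[i] @ w) - 1.0  # dual gradient w.r.t. alpha_i
            new_alpha = min(max(alpha[i] - grad / sq_norms[i], 0.0), C)
            w += (new_alpha - alpha[i]) * y[i] * X[i]
            alpha[i] = new_alpha
    return alpha


def rp_and_drp_weights(X, y, m, C=1.0, seed=0):
    """Train on randomly projected data, then recover a full-size classifier.

    X: (n, d) data, y: labels in {-1, +1}, m: projected dimension (m < d).
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Assumed projection: i.i.d. Gaussian entries scaled by 1/sqrt(m).
    R = rng.standard_normal((d, m)) / np.sqrt(m)
    X_low = X @ R
    alpha = dual_cd_svm(X_low, y, C=C, seed=seed)
    w_low = X_low.T @ (alpha * y)  # rp-style weights in the m-dim space
    w_drp = X.T @ (alpha * y)      # dual recovery: weights in the d-dim space
    return R, alpha, w_low, w_drp
```

In this sketch an rp-style prediction for a new point `x` is `np.sign((x @ R) @ w_low)`, while the dual-recovered prediction is `np.sign(x @ w_drp)` and needs no projection at test time. By construction `w_low` equals `R.T @ w_drp`, so the two classifiers differ only in whether the projection is undone.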

