Abstract: To address the low classification accuracy of large-scale Support Vector Machines (SVM) after feature dimensionality reduction by random projection, a linear kernel SVM based on dual random projection (drp-LSVM) is proposed for large-scale classification problems by introducing dual recovery theory. First, the relevant geometric properties of drp-LSVM are analyzed and proved: while retaining geometric advantages similar to those of the linear kernel SVM based on random projection (rp-LSVM), the separating hyperplane of drp-LSVM is closer to that of the original classifier trained on the complete data. Then, to solve drp-LSVM quickly, the traditional Sequential Minimal Optimization (SMO) algorithm is improved, and a drp-LSVM classifier based on the improved SMO algorithm is implemented. Finally, experimental results show that drp-LSVM inherits the advantages of rp-LSVM, reduces classification error, and improves training accuracy, and that all of its performance indexes are closer to those of the classifier trained on the original data; the classifier built on the improved SMO algorithm reduces memory consumption and achieves higher training accuracy.
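The dual-recovery idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the solver below is plain dual coordinate descent for a bias-free hinge-loss linear SVM, used as a stand-in for the improved SMO algorithm, and the function names, the Gaussian projection matrix, and the synthetic data are all assumptions made for illustration. The point it shows is the difference between the two classifiers: rp-LSVM keeps the weight vector in the projected space, whereas drp-LSVM reuses the dual variables learned there to rebuild a weight vector from the original, unprojected samples.

```python
import numpy as np


def train_dual_linear_svm(X, y, C=1.0, epochs=50, seed=0):
    """Dual coordinate descent for a bias-free, hinge-loss linear SVM.

    Stand-in solver (NOT the paper's improved SMO). Returns alpha in [0, C].
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    alpha = np.zeros(n)
    w = np.zeros(d)  # maintained as w = sum_i alpha_i * y_i * x_i
    sq_norms = np.einsum("ij,ij->i", X, X)
    for _ in range(epochs):
        for i in rng.permutation(n):
            if sq_norms[i] == 0.0:
                continue
            grad = y[i] * (X[i] @ w) - 1.0  # dual gradient w.r.t. alpha_i
            new_alpha = min(max(alpha[i] - grad / sq_norms[i], 0.0), C)
            w += (new_alpha - alpha[i]) * y[i] * X[i]
            alpha[i] = new_alpha
    return alpha


def drp_linear_svm(X, y, m, C=1.0, seed=0):
    """Train in an m-dimensional random projection, then recover w in the
    original space from the dual variables (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    R = rng.standard_normal((d, m)) / np.sqrt(m)  # assumed Gaussian projection
    X_low = X @ R
    alpha = train_dual_linear_svm(X_low, y, C=C, seed=seed)
    w_rp = (alpha * y) @ X_low  # rp-style weights, live in the projected space
    w_drp = (alpha * y) @ X     # dual recovery: same alpha, original features
    return w_drp, w_rp, R, alpha


# Synthetic two-cluster data, symmetric about the origin (illustrative only).
rng = np.random.default_rng(1)
n, d, m = 200, 100, 20
y = np.where(rng.random(n) < 0.5, -1.0, 1.0)
X = y[:, None] * 0.5 + 0.1 * rng.standard_normal((n, d))

w_drp, w_rp, R, alpha = drp_linear_svm(X, y, m=m)
acc_drp = np.mean(np.sign(X @ w_drp) == y)
acc_rp = np.mean(np.sign((X @ R) @ w_rp) == y)
print("training accuracy, recovered weights:", acc_drp)
print("training accuracy, projected weights:", acc_rp)
```

Only the projected data is needed during optimization, so the expensive step runs in m dimensions; the original data is touched once, in the final matrix-vector product that recovers the weights.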
席茜, 张凤琴, 李小青, 管桦, 陈桂茸, 王梦非. 基于对偶随机投影的线性核支持向量机[J]. 计算机应用, 2017, 37(6): 1680-1685.
XI Xi, ZHANG Fengqin, LI Xiaoqing, GUAN Hua, CHEN Guirong, WANG Mengfei. Linear kernel support vector machine based on dual random projection. Journal of Computer Applications, 2017, 37(6): 1680-1685.