基于神经正切核的多核学习方法

《计算机应用》(Journal of Computer Applications), 2021, Vol. 41, Issue 12: 3462-3467. DOI: 10.11772/j.issn.1001-9081.2021060998

第十八届中国机器学习会议 (The 18th China Conference on Machine Learning, CCML 2021)

王梅¹,², 许传海¹, 刘勇³,⁴

1. 东北石油大学计算机与信息技术学院，黑龙江大庆 163318
2. 黑龙江省石油大数据与智能分析重点实验室（东北石油大学），黑龙江大庆 163318
3. 中国人民大学高瓴人工智能学院，北京 100872
4. 大数据管理与分析方法研究北京市重点实验室（中国人民大学），北京 100872

收稿日期:2021-05-12 修回日期:2021-06-29 接受日期:2021-07-05 发布日期:2021-12-28 出版日期:2021-12-10
通讯作者: 刘勇
作者简介:王梅（1976—），女，河北保定人，教授，博士，CCF会员，主要研究方向：机器学习、核方法、模型选择；
许传海（1998—），男，黑龙江鸡西人，硕士研究生，CCF会员，主要研究方向：深度核学习；
基金资助:
国家自然科学基金面上项目(51774090);黑龙江省博士后科研启动金资助项目(LBH-Q20080);黑龙江省自然科学基金资助项目(LH2020F003);黑龙江省高等教育教学改革重点委托项目(SJGZ20190011)

Multi-kernel learning method based on neural tangent kernel

Mei WANG¹,², Chuanhai XU¹, Yong LIU³,⁴

1. School of Computer and Information Technology, Northeast Petroleum University, Daqing Heilongjiang 163318, China
2. Heilongjiang Key Laboratory of Petroleum Big Data and Intelligent Analysis (Northeast Petroleum University), Daqing Heilongjiang 163318, China
3. Gaoling School of Artificial Intelligence, Renmin University of China, Beijing 100872, China
4. Beijing Key Laboratory of Big Data Management and Analysis Methods (Renmin University of China), Beijing 100872, China

Received: 2021-05-12 Revised: 2021-06-29 Accepted: 2021-07-05 Online: 2021-12-28 Published: 2021-12-10
Contact: Yong LIU
About author: WANG Mei, born in 1976, Ph. D., professor. Her research interests include machine learning, kernel methods, and model selection.
XU Chuanhai, born in 1998, M. S. candidate. His research interests include deep kernel learning.
Supported by:
the General Program of the National Natural Science Foundation of China (51774090); the Postdoctoral Research Startup Foundation of Heilongjiang Province (LBH-Q20080); the Natural Science Foundation of Heilongjiang Province (LH2020F003); the Higher Education Teaching Reform Key Entrusted Project of Heilongjiang Province (SJGZ20190011)

摘要/Abstract

摘要：

多核学习方法是一类重要的核学习方法，但大多数多核学习方法存在如下问题：多核学习方法中的基核函数大多选择传统的具有浅层结构的核函数，在处理数据规模大且分布不平坦的问题时表示能力较弱；现有的多核学习方法的泛化误差收敛率大多为 $O(1/\sqrt{n})$，收敛速度较慢。为此，提出了一种基于神经正切核（NTK）的多核学习方法。首先，将具有深层次结构的NTK作为多核学习方法的基核函数，从而增强多核学习方法的表示能力。然后，根据主特征值比例度量证明了一种收敛速率可达 $O(1/n)$ 的泛化误差界；在此基础上，结合核对齐度量设计了一种全新的多核学习算法。最后，在多个数据集上进行了实验，实验结果表明，相比Adaboost和K近邻（KNN）等分类算法，新提出的多核学习算法具有更高的准确率和更好的表示能力，也验证了所提方法的可行性与有效性。

关键词: 机器学习, 多核学习, 神经正切核, 核对齐, 主特征值比例

Abstract:

Multi-kernel learning is an important class of kernel learning methods, but most existing multi-kernel learning methods have two problems. First, their base kernels are mostly traditional kernel functions with shallow structures, whose representation ability is weak on large-scale data with uneven distributions. Second, their generalization error bounds mostly converge at a rate of $O(1/\sqrt{n})$, which is slow. To address these problems, a multi-kernel learning method based on the Neural Tangent Kernel (NTK) was proposed. Firstly, the NTK, which has a deep structure, was used as the base kernel of the multi-kernel learning method to enhance its representation ability. Then, a generalization error bound with a convergence rate of up to $O(1/n)$ was proved based on the principal eigenvalue ratio measure. On this basis, a new multi-kernel learning algorithm was designed in combination with the kernel alignment measure. Finally, experiments were carried out on several datasets. The results show that, compared with classification algorithms such as Adaboost and K-Nearest Neighbor (KNN), the proposed multi-kernel learning algorithm achieves higher accuracy and better representation ability, which also verifies the feasibility and effectiveness of the proposed method.

Key words: machine learning, multi-kernel learning, Neural Tangent Kernel (NTK), kernel-target alignment, principal eigenvalue ratio
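
The abstract says that base kernels are combined using two quality measures, kernel alignment and the principal eigenvalue ratio, but this page gives neither the formulas nor the weighting rule. The following is therefore only a minimal sketch under stated assumptions: alignment is taken to be the standard kernel-target alignment $\langle K, yy^\top\rangle_F / (\|K\|_F \|yy^\top\|_F)$, the principal eigenvalue ratio is assumed to be the share of the trace carried by the top `t` eigenvalues, and the weights are simply the normalized alignment scores. All function names and the RBF/polynomial stand-in kernels are hypothetical and are not the authors' implementation.

```python
import numpy as np


def rbf_kernel(X, gamma=1.0):
    """Gaussian (RBF) Gram matrix, used here only as a stand-in base kernel."""
    sq = np.sum(X * X, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))


def poly_kernel(X, degree=2, coef0=1.0):
    """Polynomial Gram matrix, used here only as a stand-in base kernel."""
    return (X @ X.T + coef0) ** degree


def kernel_target_alignment(K, y):
    """Standard kernel-target alignment for labels y in {-1, +1}."""
    Y = np.outer(y, y)
    # np.linalg.norm on a 2-D array defaults to the Frobenius norm.
    return np.sum(K * Y) / (np.linalg.norm(K) * np.linalg.norm(Y))


def principal_eigenvalue_ratio(K, t):
    """Assumed definition: sum of the t largest eigenvalues divided by the trace."""
    eig = np.linalg.eigvalsh(K)[::-1]   # eigenvalues in descending order
    eig = np.clip(eig, 0.0, None)       # remove tiny negative round-off values
    return eig[:t].sum() / eig.sum()


def combine_by_alignment(kernels, y):
    """Weight each base kernel by its (non-negative) alignment, then sum.

    This weighting rule is an illustrative simplification, not the
    algorithm from the paper.
    """
    scores = np.array([max(kernel_target_alignment(K, y), 0.0) for K in kernels])
    total = scores.sum()
    if total == 0.0:
        weights = np.full(len(kernels), 1.0 / len(kernels))
    else:
        weights = scores / total
    combined = sum(w * K for w, K in zip(weights, kernels))
    return weights, combined
```

A kernel whose Gram matrix looks more like the ideal target matrix $yy^\top$ receives a larger weight, and the combined matrix stays positive semidefinite because all weights are non-negative.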

中图分类号 (CLC number): TP391

王梅, 许传海, 刘勇. 基于神经正切核的多核学习方法[J]. 计算机应用, 2021, 41(12): 3462-3467.

Mei WANG, Chuanhai XU, Yong LIU. Multi-kernel learning method based on neural tangent kernel[J]. Journal of Computer Applications, 2021, 41(12): 3462-3467.

图/表 3

Dataset	Classes	Instances	Dimensions
car	4	1 728	6
cmc	3	1 473	9
red-wine	6	1 599	11
nursery	3	12 630	8
shoppers	2	12 330	17
avila	2	10 430	10

Kernel	car: accuracy (%)	car: precision (%)	car: recall (%)	shoppers: accuracy (%)	shoppers: precision (%)	shoppers: recall (%)
rbf	70.9	37.0	39.0	88.5	81.0	71.0
poly	75.1	35.0	31.0	87.6	83.0	65.0
ntk₁	87.0	64.0	62.0	89.1	85.0	73.0
ntk₂	86.1	75.0	70.0	89.9	86.0	76.0
ntk₃	87.3	74.0	67.0	89.3	85.0	74.0
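
The ntk₁ to ntk₃ rows above refer to neural tangent kernels, but this page does not say how they were configured (the subscript may index depth, which is only a guess). As a rough illustration, the sketch below computes the NTK Gram matrix of an infinitely wide, bias-free, fully connected ReLU network using the standard recursion $\Theta^{(h)} = \Theta^{(h-1)}\dot\Sigma^{(h)} + \Sigma^{(h)}$ with the arc-cosine closed forms. The function name `relu_ntk` and its `depth` parameter are hypothetical, and nothing here is claimed to match the authors' setup.

```python
import numpy as np


def relu_ntk(X, Z, depth=2):
    """NTK Gram matrix between rows of X and rows of Z.

    Assumes an infinitely wide fully connected ReLU network without
    biases; `depth` is the number of recursion steps applied.
    """
    sigma = X @ Z.T                          # Sigma^(0)(x, z) = <x, z>
    norm_x = np.sum(X * X, axis=1)           # Sigma^(0)(x, x)
    norm_z = np.sum(Z * Z, axis=1)           # Sigma^(0)(z, z)
    # With the ReLU normalization constant 2, the diagonal entries are
    # preserved across layers, so this scale stays fixed.
    scale = np.maximum(np.sqrt(np.outer(norm_x, norm_z)), 1e-12)
    theta = sigma.copy()                     # Theta^(0) = Sigma^(0)
    for _ in range(depth):
        rho = np.clip(sigma / scale, -1.0, 1.0)
        angle = np.arccos(rho)
        sigma_dot = (np.pi - angle) / np.pi  # derivative covariance
        sigma = scale * (np.sin(angle) + (np.pi - angle) * rho) / np.pi
        theta = theta * sigma_dot + sigma    # Theta^(h) recursion
    return theta
```

The resulting matrix can be handed to any solver that accepts a precomputed Gram matrix, for example scikit-learn's `SVC(kernel="precomputed")`.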

Algorithm (accuracy, %)	cmc	red-wine	nursery	shoppers	avila
Adaboost	49.1	74.1	92.3	88.6	69.8
KNN	48.4	64.3	97.2	87.1	75.2
MKL (r+p)	51.9	78.4	98.3	89.0	75.4
NTK-MKL	53.4	74.6	99.0	89.8	79.9
