Semi-supervised learning algorithm of graph based on label metric learning

doi:10.11772/j.issn.1001-9081.2020060893

Journal of Computer Applications, 2020, Vol. 40, Issue 12: 3430-3436.

2020 China Conference on Granular Computing and Knowledge Discovery (CGCKD 2020)

LYU Yali¹,², MIAO Junzhong¹, HU Weixin¹

1. School of Information, Shanxi University of Finance and Economics, Taiyuan, Shanxi 030006, China;
2. Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education (Shanxi University), Taiyuan, Shanxi 030006, China

Received: 2020-06-12  Revised: 2020-08-20  Online: 2020-12-10  Published: 2020-10-20
Supported by:
This work is partially supported by the Natural Science Foundation of Shanxi Province (201801D121115) and the Research Project of Shanxi Scholarship Council of China (2020-095).

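Corresponding author: LYU Yali (born 1975), female, from Linfen, Shanxi; associate professor, Ph.D., CCF member. Research interests: data mining, machine learning, probabilistic reasoning. sxlvyali@126.com
About the authors: MIAO Junzhong (born 1993), male, from Jinzhong, Shanxi; M.S. candidate. Research interests: data mining, machine learning. HU Weixin (born 1996), female, from Jinzhong, Shanxi; M.S. candidate. Research interests: data mining, machine learning.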

Abstract: Most graph-based semi-supervised learning methods measure the similarity between samples without using either the known labels or the labels obtained during label propagation. Their similarity measures are also relatively fixed, so they cannot effectively capture the similarity between samples whose distribution structures are complex and varied. To address these problems, a graph-based semi-supervised learning algorithm based on label metric learning was proposed. Firstly, a similarity measure between samples was defined and used to construct the similarity matrix. Secondly, labels were propagated based on the similarity matrix, and k samples with low entropy were selected as newly determined label information. Finally, the similarity measure was updated using all available label information, and this process was repeated until the labels of all samples were learned. The proposed algorithm not only uses label information to improve the similarity measure between samples, but also makes full use of intermediate results to reduce the amount of labeled data required for semi-supervised learning. Experimental results on six real datasets show that the proposed algorithm achieves higher classification accuracy than three traditional graph-based semi-supervised learning algorithms in more than 95% of the cases.
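The three steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the metric or the update rule, so the sketch assumes a diagonal feature-weight metric updated from a between-class to within-class variance ratio, a Gaussian similarity, and the closed-form propagation of Zhou et al. The function names (similarity_matrix, propagate, update_weights, label_metric_ssl) and the parameters sigma and alpha are all hypothetical.

```python
import numpy as np


def similarity_matrix(X, w, sigma=1.0):
    """Gaussian similarity under an assumed diagonal metric with feature weights w."""
    Xw = X * np.sqrt(w)
    sq = np.sum(Xw ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Xw @ Xw.T, 0.0)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops in the graph
    return W


def propagate(W, Y, alpha=0.9):
    """Closed-form propagation F = (I - alpha*S)^-1 Y, rows normalised to probabilities."""
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(W.sum(axis=1), 1e-12))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    F = np.maximum(np.linalg.solve(np.eye(len(W)) - alpha * S, Y), 0.0)
    row = F.sum(axis=1, keepdims=True)
    # Rows that received no label mass fall back to a uniform distribution.
    return np.where(row > 1e-12, F / np.maximum(row, 1e-12), 1.0 / Y.shape[1])


def update_weights(X, y, labeled, eps=1e-8):
    """Assumed metric update: per-feature between-class / within-class variance ratio."""
    Xl, yl = X[labeled], y[labeled]
    mean = Xl.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(yl):
        Xc = Xl[yl == c]
        between += len(Xc) * (Xc.mean(axis=0) - mean) ** 2
        within += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)
    w = (between + eps) / (within + eps)
    return w * X.shape[1] / w.sum()  # rescale so the weights average to 1


def label_metric_ssl(X, y_init, n_classes, k=5, sigma=1.0, alpha=0.9):
    """y_init holds class ids for labeled samples and -1 for unlabeled ones."""
    y = y_init.copy()
    while True:
        labeled = np.flatnonzero(y >= 0)
        unlabeled = np.flatnonzero(y < 0)
        if len(unlabeled) == 0:
            return y
        # Update the metric from all labels known so far, then rebuild the graph.
        w = update_weights(X, y, labeled)
        W = similarity_matrix(X, w, sigma)
        Y = np.zeros((len(X), n_classes))
        Y[labeled, y[labeled]] = 1.0
        P = propagate(W, Y, alpha)
        # Accept the k unlabeled samples with the lowest predictive entropy.
        H = -np.sum(P * np.log(P + 1e-12), axis=1)
        pick = unlabeled[np.argsort(H[unlabeled])[:k]]
        y[pick] = P[pick].argmax(axis=1)
```

Each pass accepts at least one new label, so the loop terminates once every sample is labeled. Accepting only low-entropy predictions is what lets intermediate results feed back into the metric without injecting many uncertain labels; a smaller k makes each metric update more conservative at the cost of more iterations.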

Key words: machine learning, graph-based semi-supervised learning, metric learning, label propagation, similarity matrix

CLC Number: TP181

LYU Yali, MIAO Junzhong, HU Weixin. Semi-supervised learning algorithm of graph based on label metric learning[J]. Journal of Computer Applications, 2020, 40(12): 3430-3436.

吕亚丽, 苗钧重, 胡玮昕. 基于标签进行度量学习的图半监督学习算法[J]. 计算机应用, 2020, 40(12): 3430-3436.

