Meta label correction method based on shallow network predictions

doi:10.11772/j.issn.1001-9081.2023111616

Journal of Computer Applications, 2024, Vol. 44, Issue (11): 3364-3370. DOI: 10.11772/j.issn.1001-9081.2023111616

• Artificial intelligence •

Yuxin HUANG¹, Yiwang HUANG¹,², Hui HUANG³

1. School of Computer Science and Mathematics, Fujian University of Technology, Fuzhou Fujian 350118, China
2. School of Data Science, Tongren University, Tongren Guizhou 554300, China
3. Department of Modern Agricultural Technology, Fujian Vocational College of Agriculture, Fuzhou Fujian 350119, China

Received: 2023-11-22 Revised: 2024-02-22 Accepted: 2024-03-08 Online: 2024-03-12 Published: 2024-11-10
Contact: Yiwang HUANG
About author: HUANG Yuxin, born in 1998, a native of Yongtai, Fujian, M. S. candidate, CCF member. His research interests include noisy label learning and knowledge distillation.
HUANG Hui, born in 1992, a native of Yongtai, Fujian, M. S., teaching assistant. His research interests include machine learning and automated monitoring and control of basin water quality.
Supported by:
National Natural Science Foundation of China (62066040); Project of Tongren Municipal Science and Technology Bureau (Tongren City Scientific Research [2022] 5)

Abstract:

To address the overfitting caused by the memorization behavior of Deep Neural Networks (DNNs) on image data with noisy labels, a meta label correction method based on predictions from shallow neural networks was proposed. The method is trained in a weakly supervised manner: a label reweighting network reweights the noisy data, meta learning lets the model learn dynamically from the noisy data, and the prediction outputs of both the deep and shallow networks in the model serve as pseudo labels for training. At the same time, a knowledge distillation algorithm lets the deep network guide the training of the shallow networks. Together, these measures alleviate the model's memorization behavior and improve its robustness. Experiments on the CIFAR10/100 and Clothing1M datasets demonstrate the superiority of the proposed method over the Meta Label Correction (MLC) method. In particular, on CIFAR10 with symmetric noise ratios of 60% and 80%, accuracy improves by 3.49 and 1.56 percentage points, respectively. Furthermore, in ablation experiments on CIFAR100 with an asymmetric noise ratio of 40%, the proposed method improves accuracy by up to 5.32 percentage points over the model trained without predicted labels, confirming its feasibility and effectiveness.

Key words: noisy label, meta learning, label correction, label reweighting, knowledge distillation
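The abstract names knowledge distillation as the mechanism by which the deep network guides the shallow networks, but this page gives no formula. As a rough illustration only, the following sketch shows the standard temperature-scaled distillation loss (KL divergence between softened teacher and student outputs). The function names, the temperature value of 4.0, and the use of plain Python lists are assumptions made for this example; the paper's actual loss and hyperparameters may differ.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=4.0):
    """KL(teacher || student) on softened outputs, scaled by T^2.

    Hypothetical helper: here the "teacher" stands for the deep network
    and the "student" for a shallow network, as described in the abstract.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(
        pi * (math.log(pi) - math.log(max(qi, 1e-12)))
        for pi, qi in zip(p, q)
        if pi > 0
    )
    return kl * temperature ** 2

# Identical outputs give (numerically) zero loss; disagreeing outputs give a positive loss.
same = distillation_loss([2.0, 1.0, 0.0], [2.0, 1.0, 0.0])
different = distillation_loss([2.0, 1.0, 0.0], [0.0, 1.0, 2.0])
print(same, different)
```

The multiplication by the squared temperature is the usual convention for keeping gradient magnitudes comparable across temperatures; whether the paper applies it is not stated here.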

CLC Number: TP181

Yuxin HUANG, Yiwang HUANG, Hui HUANG. Meta label correction method based on shallow network predictions[J]. Journal of Computer Applications, 2024, 44(11): 3364-3370.

Noise type	Noise ratio/%	CIFAR10								CIFAR100
		CE	GCE	GLC	ELR	MW-Net	MLC	MSLC	Ours	CE	GCE	GLC	ELR	MW-Net	MLC	MSLC	Ours
Symmetric noise	20	86.98	90.27	91.43	91.16	91.48	89.83	93.46	92.96	58.72	71.36	69.30	74.21	69.79	58.42	72.51	73.85
Symmetric noise	40	81.88	88.50	88.52	89.15	87.34	87.32	91.42	91.06	48.20	63.39	63.24	68.28	65.44	44.92	68.98	69.74
Symmetric noise	60	74.14	83.70	84.08	86.12	81.98	83.92	87.39	87.41	37.41	58.06	56.12	59.28	55.42	28.74	60.81	61.74
Symmetric noise	80	53.82	57.27	64.21	73.86	65.88	74.73	69.87	76.29	18.10	16.51	18.59	29.78	19.62	19.32	24.32	31.60
Asymmetric noise	20	86.23	90.11	92.46	91.39	93.44	91.81	94.39	93.60	57.91	69.56	71.40	—	67.54	60.19	72.66	75.47
Asymmetric noise	40	80.11	85.24	91.74	90.12	91.64	91.35	92.81	91.08	42.74	57.05	67.73	73.26	60.24	55.69	70.51	71.63
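The gains over MLC quoted in the abstract can be recomputed from the CIFAR10 symmetric-noise rows of the table above. The short check below copies the MLC and Ours columns by hand; the variable names are illustrative.

```python
# Accuracy (%) on CIFAR10 under symmetric noise, keyed by noise ratio (%).
# Values are copied from the MLC and Ours columns of the table above.
mlc = {20: 89.83, 40: 87.32, 60: 83.92, 80: 74.73}
ours = {20: 92.96, 40: 91.06, 60: 87.41, 80: 76.29}

# Gain in percentage points at each noise ratio.
gains = {ratio: round(ours[ratio] - mlc[ratio], 2) for ratio in mlc}

for ratio, gain in gains.items():
    print(f"noise {ratio}%: +{gain:.2f} percentage points over MLC")
```

At noise ratios of 60% and 80% the differences come out to 3.49 and 1.56 percentage points, which agrees with the figures stated in the abstract.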

Method	Accuracy/%	Method	Accuracy/%
CE	68.94	MLC	—
MW-Net	73.72	MSLC	74.02
GLC	73.69	Bootstrap	69.12
ELR	72.87	Ours	73.81

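Noise type	Noise ratio/%	Accuracy/%
		With predicted labels		Without predicted labels
		Best	Last	Best	Last
Symmetric noise	20	73.85	73.43	73.19	64.86
Symmetric noise	40	69.74	69.13	68.45	48.52
Symmetric noise	60	61.74	61.61	60.86	32.27
Symmetric noise	80	31.60	20.93	29.21	12.45
Asymmetric noise	20	75.74	74.95	75.33	63.95
Asymmetric noise	40	71.63	70.73	66.31	45.85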
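One way to read the ablation table above is to compare the drop from Best to Last accuracy in each setting. Assuming that "Best" is the highest test accuracy reached during training and "Last" is the accuracy at the final epoch (this page does not define them), a large drop suggests that the model went on to memorize noisy labels. The sketch below computes that drop and the Best-accuracy gain from the table values; the dictionary layout and function names are illustrative.

```python
# (Best, Last) accuracy in %, copied from the ablation table above.
with_pred = {
    ("symmetric", 20): (73.85, 73.43),
    ("symmetric", 40): (69.74, 69.13),
    ("symmetric", 60): (61.74, 61.61),
    ("symmetric", 80): (31.60, 20.93),
    ("asymmetric", 20): (75.74, 74.95),
    ("asymmetric", 40): (71.63, 70.73),
}
without_pred = {
    ("symmetric", 20): (73.19, 64.86),
    ("symmetric", 40): (68.45, 48.52),
    ("symmetric", 60): (60.86, 32.27),
    ("symmetric", 80): (29.21, 12.45),
    ("asymmetric", 20): (75.33, 63.95),
    ("asymmetric", 40): (66.31, 45.85),
}

def best_last_gap(results):
    """Drop from Best to Last accuracy, in percentage points."""
    return {key: round(best - last, 2) for key, (best, last) in results.items()}

def best_gain(treated, baseline):
    """Difference in Best accuracy between two settings, in percentage points."""
    return {key: round(treated[key][0] - baseline[key][0], 2) for key in treated}

gap_with = best_last_gap(with_pred)
gap_without = best_last_gap(without_pred)
gain = best_gain(with_pred, without_pred)

for key in with_pred:
    print(key, "gap with:", gap_with[key], "gap without:", gap_without[key], "gain:", gain[key])
```

The largest Best-accuracy gain is 5.32 percentage points, at an asymmetric noise ratio of 40%, which matches the abstract. In every row the Best-to-Last drop is smaller with predicted labels than without them.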

Dataset	Number of classes	Training samples/10³		Test samples/10³
		Clean set	Noisy set	
CIFAR10	10	1	49	10
CIFAR100	100	1	49	10
Clothing1M	14	50	1000	10
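For the CIFAR rows of the table above, the training set is divided into a small clean subset (1,000 samples) and a noisy subset (49,000 samples). The sketch below shows one common way to build such a split with symmetric label noise. It assumes that symmetric noise means each selected label is replaced by a different class chosen uniformly at random; this page does not specify the paper's exact noise protocol, and the function name, parameters, and seed are hypothetical.

```python
import random

def split_and_corrupt(labels, num_classes, clean_size, noise_ratio, seed=0):
    """Split sample indices into a clean subset and a noisy subset.

    Returns (clean_indices, noisy_labels), where noisy_labels maps each
    noisy-subset index to its possibly corrupted label. Assumption: a
    corrupted label is always moved to a different, uniformly chosen class.
    """
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    rng.shuffle(indices)

    clean_indices = indices[:clean_size]
    noisy_labels = {}
    for i in indices[clean_size:]:
        if rng.random() < noise_ratio:
            # A non-zero offset modulo num_classes never returns the true class.
            offset = rng.randrange(1, num_classes)
            noisy_labels[i] = (labels[i] + offset) % num_classes
        else:
            noisy_labels[i] = labels[i]
    return clean_indices, noisy_labels

# Toy stand-in for CIFAR10 labels: 50,000 samples, 10 classes.
toy_labels = [i % 10 for i in range(50000)]
clean_idx, noisy = split_and_corrupt(toy_labels, num_classes=10, clean_size=1000, noise_ratio=0.6)
flipped = sum(1 for i, y in noisy.items() if y != toy_labels[i])
print(len(clean_idx), len(noisy), round(flipped / len(noisy), 3))
```

Some works instead allow the replacement label to coincide with the true class, which lowers the effective noise rate; the variant shown here is only one of the conventions in use.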
