Hard-negative sample mining for metric learning based on linear assignment

Journal of Computer Applications, 2020, Vol. 40, Issue (2): 352-357. DOI: 10.11772/j.issn.1001-9081.2019081403

Paper from the 2019 National Annual Conference on Open Distributed and Parallel Computing (DPCS 2019)

Taiming FU, Yan CHEN, Taoshen LI

College of Computer, Electronics and Information, Guangxi University, Nanning, Guangxi 530004, China

Received: 2019-07-31 Revised: 2019-09-25 Accepted: 2019-09-25 Online: 2019-11-04 Published: 2020-02-10
Corresponding author: Yan CHEN
About the authors: FU Taiming, born in 1995 in Nanning, Guangxi, M. S. candidate, CCF member. His research interests include optimization of intelligent algorithms and computer vision.
LI Taoshen, born in 1957 in Yongning, Guangxi, Ph. D., professor, CCF member. His research interests include intelligent systems and optimization of intelligent algorithms.
Supported by: the National Natural Science Foundation of China (61762008); the Key Research and Development Program of Guangxi (AB17195014)

Abstract

Scientists identify whale species by the shape of the tail and its distinctive markings, but recognition by eye and manual labeling are tedious. Whale tail photo datasets are also imbalanced: some categories have very few samples, in some cases only one. In addition, differences between individual samples are small and the data contain unknown categories, which makes it difficult to label whale identities automatically by image classification. To address the difficulty that metric learning has with classification in this task, training batches were constructed dynamically on the basis of a Siamese Neural Network (SNN), using a Linear Assignment Problem (LAP) algorithm for hard-negative sample mining during training. Firstly, image feature vectors were extracted from the training samples, and the similarity metric between feature vectors was calculated. Then, LAP was used to assign sample pairs to the model: training batches were constructed dynamically from the metric score matrix, so that difficult sample pairs were trained on in a targeted way. Experimental results on a whale tail image dataset with imbalanced data distribution and on the CUB-200-2011 dataset show that the proposed algorithm achieves good results in minority-class learning and fine-grained image classification.

Key words: linear assignment, hard-negative sample mining, metric learning, fine-grained image recognition, Siamese Neural Network (SNN)
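
The pairing step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `hard_negative_pairs`, the toy score matrix, and the choice to maximize the total similarity of different-class pairs are all assumptions made for illustration, since this page does not give the exact cost definition. The brute-force search over permutations is only practical for a handful of samples; at realistic sizes a Hungarian-type solver such as SciPy's `scipy.optimize.linear_sum_assignment` would normally be used instead.

```python
from itertools import permutations

FORBIDDEN = float("-inf")  # marks pairs that must never be assigned


def hard_negative_pairs(scores, labels):
    """Give every sample one different-class partner so that the summed
    similarity of all chosen pairs is as large as possible.

    scores[i][j] is assumed to be the model's similarity score for samples
    i and j (higher means the model finds them harder to tell apart).
    """
    n = len(labels)
    # Same-class pairs (including a sample with itself) are not negatives.
    masked = [
        [FORBIDDEN if labels[i] == labels[j] else scores[i][j] for j in range(n)]
        for i in range(n)
    ]
    best_total, best_perm = FORBIDDEN, None
    # Exhaustive linear assignment: each sample is used as a partner exactly once.
    for perm in permutations(range(n)):
        total = sum(masked[i][perm[i]] for i in range(n))
        if total > best_total:
            best_total, best_perm = total, perm
    if best_perm is None:
        raise ValueError("no assignment made of different-class pairs exists")
    return [(i, best_perm[i]) for i in range(n)]


# Toy example: samples 0 and 1 share a class, samples 2 and 3 are other classes.
labels = ["a", "a", "b", "c"]
scores = [
    [1.0, 0.9, 0.8, 0.3],
    [0.9, 1.0, 0.2, 0.7],
    [0.8, 0.2, 1.0, 0.4],
    [0.3, 0.7, 0.4, 1.0],
]
pairs = hard_negative_pairs(scores, labels)
```

Because every sample appears exactly once as a partner, the assignment spreads the hard negatives across the batch rather than letting one confusing sample be paired with everything, which is the property that distinguishes this from a per-row "pick the highest score" rule.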

CLC number: TP391.41

Taiming FU, Yan CHEN, Taoshen LI. Hard-negative sample mining for metric learning based on linear assignment[J]. Journal of Computer Applications, 2020, 40(2): 352-357.

Figures/Tables

Layer (type)	Output Shape	Param #	Connected to
input_2 (InputLayer)	(None, 1024)	0	\
input_3 (InputLayer)	(None, 1024)	0	\
lambda_1 (Lambda)	(None, 1024)	0	input_2, input_3
lambda_2 (Lambda)	(None, 1024)	0	input_2, input_3
lambda_3 (Lambda)	(None, 1024)	0	input_2, input_3
lambda_4 (Lambda)	(None, 1024)	0	lambda_7
concatenate_1 (Concatenate)	(None, 4096)	0	lambda_1, lambda_2, …, lambda_4
reshape_1 (Reshape)	(None, 4, 1024, 1)	0	concatenate_1
conv2d_56 (Conv2D)	(None, 1, 1024, 32)	160	reshape_1
reshape_2 (Reshape)	(None, 1024, 32, 1)	0	conv2d_56
conv2d_57 (Conv2D)	(None, 1024, 1, 1)	33	reshape_2
flatten (Flatten)	(None, 1024)	0	conv2d_57
weighted-average (Dense)	(None, 1)	1025	flatten
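
The parameter counts in the layer summary above can be checked with a little arithmetic. The kernel sizes used below, 4×1 for `conv2d_56` and 1×32 for `conv2d_57`, are not stated on this page; they are inferred from the input and output shapes under the assumption of unpadded ("valid") convolutions with a bias term, and the helper names are hypothetical.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter for a standard 2-D convolution."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(in_features, units):
    """Weights plus one bias per unit for a fully connected layer."""
    return (in_features + 1) * units


# conv2d_56: (4, 1024, 1) -> (1, 1024, 32), assumed 4x1 kernel, 32 filters
conv2d_56 = conv2d_params(4, 1, 1, 32)
# conv2d_57: (1024, 32, 1) -> (1024, 1, 1), assumed 1x32 kernel, 1 filter
conv2d_57 = conv2d_params(1, 32, 1, 1)
# weighted-average: 1024 flattened features -> 1 output score
weighted_average = dense_params(1024, 1)

total_trainable = conv2d_56 + conv2d_57 + weighted_average
```

Under these assumptions the three counts match the 160, 33 and 1025 listed in the table, which suggests that the comparison head holds only about 1.2 thousand trainable parameters and that most of the model's capacity sits in the branch encoder.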

Sampling strategy	mAP@5
SNN	0.851
SNN with OHEM	0.894
SNN with FL	0.921
SNN with LAP	0.946

Branch model encoder	mAP@5
ResNet18	0.850
ResNet50	0.906
SE-ResNet50	0.932
DenseNet121	0.946
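
The tables above report mAP@5 without defining it on this page. A minimal sketch follows, assuming the common definition for identification tasks in which each query image has exactly one correct label and the model submits up to five ranked guesses, so that a query scores 1/rank of the correct guess or 0 if it is missing. The function name `map_at_5` and the toy labels are hypothetical.

```python
def map_at_5(true_labels, ranked_predictions):
    """Mean average precision at 5 for queries with a single correct label."""
    total = 0.0
    for truth, ranked in zip(true_labels, ranked_predictions):
        # Only the first five guesses count; the first correct one earns 1/rank.
        for rank, guess in enumerate(ranked[:5], start=1):
            if guess == truth:
                total += 1.0 / rank
                break
    return total / len(true_labels)


true_labels = ["w1", "w2", "w3", "w4"]
ranked_predictions = [
    ["w1", "w9"],                    # correct at rank 1 -> 1.0
    ["w7", "w2", "w3"],              # correct at rank 2 -> 0.5
    ["w5", "w6", "w8", "w1", "w3"],  # correct at rank 5 -> 0.2
    ["w3", "w5"],                    # never correct     -> 0.0
]
score = map_at_5(true_labels, ranked_predictions)  # (1.0 + 0.5 + 0.2 + 0.0) / 4
```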

Sampling strategy	Recall@1	Recall@2	Recall@4	Recall@8	Recall@16	Recall@32
Random sampling	50.2	61.9	70.6	81.4	87.0	91.3
Hard example mining	52.6	64.9	72.8	84.7	89.9	92.6
Linear assignment	54.4	67.2	76.9	85.6	92.1	93.5
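
For readers unfamiliar with the Recall@K columns, here is a minimal sketch under the usual retrieval definition: a query counts as a hit if at least one of its K nearest neighbors in the embedding space shares its class. The function name `recall_at_k` and the toy neighbor lists are hypothetical, and the result is scaled to a percentage on the assumption that the table uses that scale.

```python
def recall_at_k(query_labels, ranked_neighbor_labels, k):
    """Percentage of queries with a same-class item among their top-k neighbors."""
    hits = sum(
        1
        for truth, neighbors in zip(query_labels, ranked_neighbor_labels)
        if truth in neighbors[:k]
    )
    return 100.0 * hits / len(query_labels)


query_labels = ["a", "b", "c", "d"]
# Class labels of each query's retrieved neighbors, nearest first.
ranked_neighbor_labels = [
    ["a", "c", "b", "d"],  # first match at position 1
    ["c", "b", "a", "d"],  # first match at position 2
    ["a", "b", "d", "c"],  # first match at position 4
    ["a", "b", "c", "a"],  # no match
]
recalls = {k: recall_at_k(query_labels, ranked_neighbor_labels, k) for k in (1, 2, 4)}
```

Recall@K can only stay the same or rise as K grows, which is consistent with every row of the table increasing from left to right.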
