Deep distance metric learning method based on optimized triplet loss

doi:10.11772/j.issn.1001-9081.2021061107

Journal of Computer Applications, 2021, Vol. 41, Issue 12: 3480-3484. DOI: 10.11772/j.issn.1001-9081.2021061107

The 18th China Conference on Machine Learning (CCML 2021)

Zilong LI¹,², Yong ZHOU², Rong BAO¹, Hongdong WANG¹

1. School of Information Engineering, Xuzhou University of Technology, Xuzhou, Jiangsu 221018, China
2. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China

Received: 2021-05-12 Revised: 2021-07-26 Accepted: 2021-08-05 Online: 2021-12-28 Published: 2021-12-10
Corresponding author: Zilong LI
About the authors:
ZHOU Yong, born in 1974 in Xuzhou, Jiangsu, Ph.D., professor, CCF member. His research interests include deep learning and computer vision.
BAO Rong, born in 1968 in Shanghai, Ph.D., professor, CCF member. Her research interests include deep learning and information processing.
WANG Hongdong, born in 1986 in Linyi, Shandong, Ph.D., lecturer. His research interests include image processing and machine learning.
Supported by:
the National Natural Science Foundation of China (61806206); the Technology Project of Jiangsu Province Construction System (2018ZD077); the Scientific Research Project of Xuzhou University of Technology (XKY2019107); the Natural Science Research Project of Jiangsu Higher Education Institutions (20KJB170023)

Abstract:

To address the problems that a single deep distance metric based on triplet loss adapts poorly to diverse datasets and is prone to overfitting, a deep distance metric learning method based on optimized triplet loss was proposed. Firstly, the relative distances of triplet training samples mapped by the neural network were thresholded, and a piecewise linear function was used as the evaluation function of the relative distance. Secondly, the evaluation function was added to the Boosting algorithm as a weak classifier to generate a strong classifier. Finally, an alternating optimization method was used to learn the parameters of the weak classifiers and the neural network. In an evaluation of various deep distance metric learning methods on the image retrieval task, the Recall@1 of the proposed method was 4.2, 3.2 and 0.6 percentage points higher than the previous best results on the CUB-200-2011, Cars-196 and SOP datasets, respectively. Experimental results show that the proposed method outperforms the comparison methods while avoiding overfitting to a certain extent.

Key words: deep distance metric, deep learning, triplet loss, Convolutional Neural Network (CNN), Boosting
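
The first step described in the abstract (thresholding the relative distance of a triplet and scoring it with a piecewise linear function) can be illustrated with a minimal NumPy sketch. The abstract does not give the exact function used in the paper, so this sketch assumes squared Euclidean distances and the familiar hinge form max(0, threshold - (d(a, n) - d(a, p))). The names `relative_distance` and `piecewise_linear_eval`, the value of `threshold`, and the toy embeddings are all illustrative assumptions.

```python
import numpy as np

def relative_distance(anchor, positive, negative):
    """Relative distance d(a, n) - d(a, p) per triplet.

    Squared Euclidean distance is an assumption of this sketch.
    """
    d_ap = np.sum((anchor - positive) ** 2, axis=1)
    d_an = np.sum((anchor - negative) ** 2, axis=1)
    return d_an - d_ap

def piecewise_linear_eval(rel_dist, threshold=1.0):
    """Hinge-style piecewise linear evaluation (assumed form).

    Zero once the relative distance reaches the threshold,
    and a linear penalty below it.
    """
    return np.maximum(0.0, threshold - rel_dist)

# Toy embeddings: two triplets in a 2-D embedding space.
anchor = np.array([[0.0, 0.0], [0.0, 0.0]])
positive = np.array([[0.0, 1.0], [1.0, 0.0]])
negative = np.array([[2.0, 0.0], [1.0, 0.0]])

rel = relative_distance(anchor, positive, negative)
loss = piecewise_linear_eval(rel, threshold=1.0)
print(rel, loss)
```

In this toy case the first triplet already satisfies the threshold and contributes no penalty, while the second (negative as close as the positive) is penalized. In a Boosting setting such as the one the abstract describes, several evaluation functions of this kind would be combined; that combination is not sketched here because the abstract does not specify it.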

CLC number: TP391

Zilong LI, Yong ZHOU, Rong BAO, Hongdong WANG. Deep distance metric learning method based on optimized triplet loss[J]. Journal of Computer Applications, 2021, 41(12): 3480-3484.

References

[1] LIU B, LI R L, FENG J F. A brief introduction to deep metric learning[J]. CAAI Transactions on Intelligent Systems, 2019, 14(6): 1064-1072. DOI: 10.11992/tis.201906045
[2] HADSELL R, CHOPRA S, LeCUN Y. Dimensionality reduction by learning an invariant mapping[C]// Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2006: 1735-1742. DOI: 10.1109/cvpr.2006.100
[3] SONG H O, XIANG Y, JEGELKA S, et al. Deep metric learning via lifted structured feature embedding[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 4004-4012. DOI: 10.1109/cvpr.2016.434
[4] SUN Y F, CHENG C M, ZHANG Y H, et al. Circle loss: a unified perspective of pair similarity optimization[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 6397-6406. DOI: 10.1109/cvpr42600.2020.00643
[5] SCHROFF F, KALENICHENKO D, PHILBIN J. FaceNet: a unified embedding for face recognition and clustering[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 815-823. DOI: 10.1109/cvpr.2015.7298682
[6] MOVSHOVITZ-ATTIAS Y, TOSHEV A, LEUNG T K, et al. No fuss distance metric learning using proxies[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 360-368. DOI: 10.1109/iccv.2017.47
[7] QIAN Q, TANG J S, LI H, et al. Large-scale distance metric learning with uncertainty[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 8542-8550. DOI: 10.1109/cvpr.2018.00891
[8] GE W F, HUANG W L, DONG D K, et al. Deep metric learning with hierarchical triplet loss[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11210. Cham: Springer, 2018: 272-288.
[9] WU C Y, MANMATHA R, SMOLA A J, et al. Sampling matters in deep embedding learning[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 2859-2867. DOI: 10.1109/iccv.2017.309
[10] SONG H O, JEGELKA S, RATHOD V, et al. Deep metric learning via facility location[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 2206-2214. DOI: 10.1109/cvpr.2017.237
[11] SOHN K. Improved deep metric learning with multi-class N-pair loss objective[C]// Proceedings of the 30th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2016: 1857-1865.
[12] WANG J, ZHOU F, WEN S L, et al. Deep metric learning with angular loss[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 2612-2620. DOI: 10.1109/iccv.2017.283
[13] KOZAKAYA T, ITO S, KUBOTA S. Random ensemble metrics for object recognition[C]// Proceedings of the 2011 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2011: 1959-1966. DOI: 10.1109/iccv.2011.6126466
[14] NEGREL R, LECHERVY A, JURIE F. Boosted metric learning for efficient identity-based face retrieval[C]// Proceedings of the 2015 British Machine Vision Conference. Durham: BMVA Press, 2015: No.139. DOI: 10.5244/c.29.139
[15] KIM W, GOYAL B, CHAWLA K, et al. Attention-based ensemble for deep metric learning[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11205. Cham: Springer, 2018: 760-777.
[16] XUAN H, SOUVENIR R, PLESS R. Deep randomized ensembles for metric learning[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11220. Cham: Springer, 2018: 751-762.
[17] FRIEDMAN J H. Greedy function approximation: a gradient boosting machine[J]. The Annals of Statistics, 2001, 29(5): 1189-1232. DOI: 10.1214/aos/1013203451
[18] ZHANG Z M, STURGESS P, SENGUPTA S, et al. Efficient discriminative learning of parametric nearest neighbor classifiers[C]// Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2012: 2232-2239. DOI: 10.1109/cvpr.2012.6247932
[19] WANG X, HAN X T, HUANG W L, et al. Multi-similarity loss with general pair weighting for deep metric learning[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 5017-5025. DOI: 10.1109/cvpr.2019.00516
[20] YUAN Y H, YANG K Y, ZHANG C. Hard-aware deeply cascaded embedding[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 814-823. DOI: 10.1109/iccv.2017.94
[21] QIAN Q, SHANG L, SUN B G, et al. SoftTriple loss: deep metric learning without triplet sampling[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 6449-6457. DOI: 10.1109/iccv.2019.00655

Recall@K (%) of the compared methods on the CUB-200-2011, Cars-196 and SOP datasets ("-": not reported)

Dataset	CUB-200-2011						Cars-196						SOP
Method	K=1	K=2	K=4	K=8	K=16	K=32	K=1	K=2	K=4	K=8	K=16	K=32	K=1	K=10	K=100	K=1000
Ref. [2]	26.4	37.7	49.8	62.3	76.4	85.3	21.7	32.3	46.1	58.9	72.2	83.4	42.0	58.2	73.8	89.1
Ref. [5]	36.1	48.6	59.3	70.0	80.2	88.4	39.1	50.4	63.3	74.5	84.1	89.8	42.1	63.5	82.5	94.8
Ref. [3]	47.2	58.9	70.2	80.2	89.3	93.2	49.0	60.3	72.1	81.5	89.2	92.8	62.1	79.8	91.3	97.4
Ref. [11]	51.0	63.3	74.3	83.2	-	-	71.1	79.7	86.5	91.6	-	-	67.7	83.8	93.0	97.8
Ref. [12]	54.7	66.3	76.0	83.9	-	-	71.4	81.4	87.5	92.1	-	-	70.9	85.0	93.5	98.0
Ref. [20]	53.6	65.7	77.0	85.6	91.5	95.5	73.7	83.2	89.5	93.8	96.7	98.4	69.5	84.4	92.8	97.7
Ref. [8]	57.1	68.8	78.7	86.5	92.5	95.5	81.4	88.0	92.7	95.7	97.4	99.0	74.8	88.3	94.8	98.4
Ref. [15]	60.6	71.5	79.8	87.4	-	-	85.2	90.5	94.0	96.1	-	-	76.3	88.4	94.8	98.2
Ref. [16]	63.9	75.0	83.1	89.7	-	-	86.0	91.7	95.0	97.2	-	-	-	-	-	-
Ref. [19]	65.7	77.0	86.3	91.2	95.0	97.3	84.1	90.4	94.0	96.5	98.0	98.9	78.2	90.5	96.0	98.7
Ref. [21]	65.4	76.4	84.5	90.4	-	-	84.5	90.7	94.5	96.9	-	-	78.3	90.3	95.9	-
Proposed method	69.9	80.7	87.6	93.7	97.8	98.9	89.2	94.7	97.4	98.7	99.4	99.7	78.9	91.9	97.1	99.4
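
For readers unfamiliar with the metric reported in the table above, Recall@K is commonly computed as the fraction of query images whose K nearest neighbours in the embedding space include at least one image of the same class. The following is a minimal sketch of that common definition, not the authors' evaluation code. It assumes squared Euclidean distance and that each query is excluded from its own neighbour list; the function name `recall_at_k` and the toy data are illustrative.

```python
import numpy as np

def recall_at_k(embeddings, labels, ks=(1, 2, 4)):
    """Fraction of queries whose k nearest neighbours (query excluded)
    contain at least one sample with the same label."""
    labels = np.asarray(labels)
    # Pairwise squared Euclidean distances between all embeddings.
    diff = embeddings[:, None, :] - embeddings[None, :, :]
    dist = np.sum(diff ** 2, axis=2)
    np.fill_diagonal(dist, np.inf)            # a query must not retrieve itself
    order = np.argsort(dist, axis=1)          # nearest neighbours first
    hits = labels[order] == labels[:, None]   # True where a neighbour shares the label
    return {k: float(np.mean(hits[:, :k].any(axis=1))) for k in ks}

# Toy 1-D embeddings: the last sample of class 0 lies closer to class 1.
emb = np.array([[0.0], [0.1], [1.0], [1.3], [0.6]])
lab = [0, 0, 1, 1, 0]
scores = recall_at_k(emb, lab, ks=(1, 2))
print(scores)
```

On this toy data four of the five queries retrieve a same-class sample first, so Recall@1 is 0.8, and the remaining query succeeds at its second neighbour, so Recall@2 is 1.0. The values in the table are the same quantity expressed as percentages.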
