Dual-branch distribution consistency contrastive learning model for hard negative sample identification in chest X-rays

doi:10.11772/j.issn.1001-9081.2024070968

Journal of Computer Applications, 2025, Vol. 45, Issue (7): 2369-2377. DOI: 10.11772/j.issn.1001-9081.2024070968

• Multimedia Computing and Computer Simulation •

Jin XIE¹, Surong CHU¹, Yan QIANG¹,², Juanjuan ZHAO¹,³, Hua ZHANG⁴, Yong GAO⁵

1. College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan Shanxi 030600, China
2. School of Software, North University of China, Taiyuan Shanxi 030051, China
3. School of Information Engineering, Jinzhong College of Information, Jinzhong Shanxi 030800, China
4. Department of CT Radiology, First Hospital of Shanxi Medical University, Taiyuan Shanxi 030012, China
5. Department of Respiratory and Critical Care Medicine, Sinopharm Tongmei General Hospital, Datong Shanxi 037000, China

Received: 2024-07-09 Revised: 2024-09-25 Accepted: 2024-09-29 Online: 2025-07-10 Published: 2025-07-10
Contact: Yan QIANG
About the authors:
XIE Jin, born in 1999, native of Rugao, Jiangsu, M. S. candidate. His research interests include computer vision and image processing.
CHU Surong, born in 1990, native of Taiyuan, Shanxi, Ph. D. candidate. Her research interests include computer vision and image processing.
ZHAO Juanjuan, born in 1975, native of Taiyuan, Shanxi, Ph. D., professor, CCF senior member. Her research interests include intelligent information processing and image processing.
ZHANG Hua, born in 1973, native of Taiyuan, Shanxi, M. S., chief physician. Her research interests include CT imaging diagnosis of cardiovascular and cerebrovascular diseases.
GAO Yong, born in 1974, native of Datong, Shanxi, M. S., deputy chief physician. His research interests include respiratory and critical care medicine.
Supported by:
National Natural Science Foundation of China (U21A20469); National Natural Science Foundation of China (62376183); Open Project of NHC Key Laboratory of Pneumoconiosis (YKFKT004); Central Government Guiding Local Science and Technology Development Fund (YDZJSX2022C004); Special Project of Science and Technology Innovation Teams of Shanxi Province (202304051001009)

Abstract

Contrastive Learning (CL) methods struggle to distinguish similar chest X-ray samples and to detect tiny lesions in medical images. To address these problems, a dual-branch distribution consistency contrastive learning model (TCL) was proposed. Firstly, inpainting and outpainting data augmentation strategies were employed to strengthen the model's focus on lung textures and improve its ability to recognize complex structures. Secondly, a collaborative learning approach was used to further enhance the model's sensitivity to tiny lung lesions by capturing lesion information from different views. Finally, the heavy-tailed property of the Student-t distribution was used to distinguish hard negative samples and to constrain the distributions between different augmented views and the samples to be consistent, thereby strengthening the learning of feature relationships between hard negatives and other samples and reducing the influence of hard negatives on the model. Experimental results on four chest X-ray datasets, namely pneumoconiosis, NIH (National Institutes of Health), Chest X-Ray Images (Pneumonia), and COVID-19 (Corona Virus Disease 2019), show that, compared with the MoCo v2 (Momentum Contrast) model, TCL improves accuracy by 6.14%, 3.08%, 0.65%, and 4.67%, respectively, and that its transfer performance on the COVID-19 dataset improves by 4.10%, 0.61%, and 8.41% at label rates of 5%, 20%, and 50%, respectively. In addition, CAM (Class Activation Mapping) visualization shows that TCL attends to the important pathological regions, which further supports the effectiveness of the proposed model.

Key words: self-supervised learning (SSL), contrastive learning (CL), medical image processing, hard negative sample, distribution consistency
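
The abstract relies on the heavy tail of the Student-t distribution but does not spell out the kernel. The following is only a minimal sketch of the heavy-tail property itself, not the TCL loss: the distances, the degrees of freedom `nu=1.0`, and the helper names are assumptions made for illustration. It compares how a Gaussian kernel and a Student-t kernel spread normalized similarity mass over a few negatives at increasing distance from an anchor.

```python
import math


def gaussian_kernel(d):
    """Unnormalized Gaussian similarity for a distance d."""
    return math.exp(-d * d / 2.0)


def student_t_kernel(d, nu=1.0):
    """Unnormalized Student-t similarity; nu (degrees of freedom) is an assumed value."""
    return (1.0 + d * d / nu) ** (-(nu + 1.0) / 2.0)


def normalize(weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical distances from an anchor to five negatives.
# The first two are close to the anchor, i.e. "hard" negatives.
distances = [0.5, 1.0, 3.0, 4.0, 5.0]

p_gauss = normalize([gaussian_kernel(d) for d in distances])
p_t = normalize([student_t_kernel(d) for d in distances])

for d, pg, pt in zip(distances, p_gauss, p_t):
    print(f"d={d:.1f}  gaussian={pg:.6f}  student_t={pt:.6f}")
```

In this toy setting the Gaussian kernel puts almost all of its mass on the closest negatives, whereas the Student-t kernel leaves a visible share for the distant ones, so the relationship to every sample stays represented in the distribution. How the paper turns this property into a training objective is not stated on this page.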

CLC number: TP391.4

Jin XIE, Surong CHU, Yan QIANG, Juanjuan ZHAO, Hua ZHANG, Yong GAO. Dual-branch distribution consistency contrastive learning model for hard negative sample identification in chest X-rays[J]. Journal of Computer Applications, 2025, 45(7): 2369-2377.

Figures/Tables 15

Fig. 1 Chest X-rays of non-pneumoconiosis and stage I pneumoconiosis

Fig. 2 Overall framework of proposed model

Fig. 3 Collaborative learning framework

Fig. 4 Schematic representation of different methods for handling hard negative samples

Fig. 5 TCL Function framework

Tab. 1 Comparison of classification accuracy of different methods (%)

Method	pneumoconiosis	NIH	Pneumonia	COVID-19
Supervised [34]	77.68	84.98	98.75	97.62
MoCo v2 [22]	67.18	66.42	89.67	85.49
SimCLR [21]	64.82	65.48	84.27	82.61
ReSSL [24]	68.37	66.59	88.62	86.43
BYOL [23]	66.93	66.35	83.59	85.22
CPC v2 [35]	67.86	63.51	87.42	86.32
TCL	71.31	68.47	90.26	89.49

Tab. 2 Transfer learning results (%)

Source domain	Label rate	Target domain	Method	Accuracy
NIH	5	COVID-19	MoCo v2	66.47
NIH	5	COVID-19	TCL	69.20
NIH	5	Pneumonia	MoCo v2	63.29
NIH	5	Pneumonia	TCL	67.71
NIH	20	COVID-19	MoCo v2	75.91
NIH	20	COVID-19	TCL	76.38
NIH	20	Pneumonia	MoCo v2	69.83
NIH	20	Pneumonia	TCL	72.47
NIH	50	COVID-19	MoCo v2	79.28
NIH	50	COVID-19	TCL	85.95
NIH	50	Pneumonia	MoCo v2	81.42
NIH	50	Pneumonia	TCL	83.65
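
The percentages quoted in the abstract are larger than the raw accuracy differences in Tab. 1 and Tab. 2, which suggests they are relative gains over MoCo v2. The sketch below checks that reading under two assumptions that are mine rather than the authors': the gain is computed as (TCL - MoCo v2) / MoCo v2, and the result is truncated (not rounded) to two decimals. The function and variable names are hypothetical.

```python
import math


def relative_gain_pct(baseline, ours):
    """Relative gain of `ours` over `baseline` in percent, truncated to two decimals."""
    gain = (ours - baseline) / baseline * 100.0
    return math.floor(gain * 100.0) / 100.0


# (MoCo v2, TCL) accuracy pairs copied from Tab. 1.
tab1 = {
    "pneumoconiosis": (67.18, 71.31),
    "NIH": (66.42, 68.47),
    "Pneumonia": (89.67, 90.26),
    "COVID-19": (85.49, 89.49),
}

# (MoCo v2, TCL) pairs from Tab. 2 for the COVID-19 target, keyed by label rate (%).
tab2_covid = {
    5: (66.47, 69.20),
    20: (75.91, 76.38),
    50: (79.28, 85.95),
}

tab1_gains = {name: relative_gain_pct(*pair) for name, pair in tab1.items()}
tab2_gains = {rate: relative_gain_pct(*pair) for rate, pair in tab2_covid.items()}

print(tab1_gains)
print(tab2_gains)
```

Under these assumptions the computed values coincide with the 6.14%, 3.08%, 0.65%, 4.67% and 4.10%, 0.61%, 8.41% reported in the abstract, so the headline numbers should be read as relative improvements rather than percentage-point differences.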

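Tab. 3 Results of different data augmentation method combinations (√ = used, - = not used)

Experiment	Inpainting (strong)	Random crop (strong)	Outpainting (strong)	Gaussian blur (weak)	Horizontal flip (weak)	Random rotation (weak)	Accuracy/%
Experiment 1	√	-	√	√	-	-	69.75
Experiment 2	√	√	-	√	√	-	65.62
Experiment 3	-	√	√	-	-	√	67.27
Experiment 4	√	-	√	-	√	√	71.31
Experiment 5	-	√	√	√	√	-	67.46
Experiment 6	√	√	-	-	√	-	67.92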
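
The best combination in Tab. 3 pairs inpainting and outpainting as the strong augmentations. This page does not describe how the two operations are implemented, so the sketch below assumes a common formulation: inpainting overwrites a rectangular region inside the image with noise, and outpainting keeps only a rectangular region and overwrites everything outside it. The toy image, the box, and the function names are all hypothetical.

```python
import random


def inpaint(image, box, rng):
    """Return a copy of `image` with the pixels inside `box` replaced by random noise."""
    top, left, height, width = box
    out = [row[:] for row in image]
    for r in range(top, top + height):
        for c in range(left, left + width):
            out[r][c] = rng.random()
    return out


def outpaint(image, box, rng):
    """Return a copy of `image` that keeps only the pixels inside `box`; the rest is noise."""
    top, left, height, width = box
    out = [[rng.random() for _ in row] for row in image]
    for r in range(top, top + height):
        for c in range(left, left + width):
            out[r][c] = image[r][c]
    return out


rng = random.Random(0)
image = [[0.5] * 8 for _ in range(8)]  # toy 8x8 "radiograph" with constant intensity
box = (2, 2, 4, 4)  # (top, left, height, width) of the affected region

inpainted = inpaint(image, box, rng)
outpainted = outpaint(image, box, rng)
```

A model trained on such views has to rely on the surrounding context (inpainting) or on the retained interior (outpainting), which is one plausible way these augmentations could push attention toward lung texture.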

Fig. 6 Loss curves of different methods on pneumoconiosis dataset

Fig. 7 AUC curves of three contrastive learning methods on four chest X-ray image datasets

Fig. 8 Visualization results of chest X-ray images

Tab. 4 Ablation experimental results (accuracy, %)

Method	pneumoconiosis	NIH	Pneumonia	COVID-19
P.S	61.59	60.75	81.38	83.29
T	60.85	57.64	76.29	72.37
CL	61.61	60.53	82.69	81.62
P.S+T	64.89	61.72	83.65	84.73
T+CL	62.57	63.24	84.58	83.26
P.S+CL	62.14	61.82	84.92	84.41
Full (w/o MIL)	64.94	63.73	85.37	87.29
Full (w/o AML)	66.32	66.92	86.49	86.52
Full model	71.31	68.47	90.26	89.49

Tab. 5 Influence of different modules on transferability (%)

Method	Accuracy
TCL (w/o P.S)	77.16
TCL (w/o T)	71.35
TCL (w/o CL)	74.61
TCL	81.27

Tab. 6 Influence of hyperparameter α on model performance

α	Accuracy/%	α	Accuracy/%	α	Accuracy/%
0.0	67.76	0.4	69.86	0.8	69.73
0.1	68.36	0.5	70.35	0.9	69.79
0.2	68.72	0.6	71.31	1.0	68.61
0.3	69.41	0.7	70.49

Fig. 9 Influence of hyperparameters τ and τm on model performance
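
Fig. 9 studies two temperatures, τ and τm, whose exact roles are not defined on this page. As a hedged illustration only, the sketch below assumes they play the roles that two temperatures usually play in relational consistency objectives: τm sharpens the similarity distribution of a momentum (target) branch, τ scales the distribution of the online branch, and a cross-entropy term pulls the two together. The similarity scores, the temperature values, and all names are assumptions.

```python
import math


def softmax(scores, temperature):
    """Temperature-scaled softmax; smaller temperatures give sharper distributions."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, predicted):
    """Cross-entropy H(target, predicted) between two discrete distributions."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted))


# Hypothetical similarities of two augmented views to the same four reference samples.
sims_momentum_view = [0.9, 0.7, 0.2, 0.1]
sims_online_view = [0.8, 0.75, 0.3, 0.1]

tau, tau_m = 0.1, 0.04  # assumed values, not taken from the paper

target = softmax(sims_momentum_view, tau_m)
predicted = softmax(sims_online_view, tau)
consistency_loss = cross_entropy(target, predicted)

print(f"target={target}")
print(f"predicted={predicted}")
print(f"consistency loss={consistency_loss:.4f}")
```

With τm smaller than τ the target distribution is more peaked than the prediction, so the loss mostly rewards agreement on the most similar samples; this is the kind of trade-off a sensitivity study over the two temperatures would expose.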

References 35

[1]	ORAÑO J F V， ORAÑO-MAAGHOP J F V， MARAVILLAS E A. CXR-based lung disease classification using convolutional neural network ［C］// Proceedings of the IEEE 14th International Conference on Humanoid， Nanotechnology， Information Technology， Communication and Control， Environment， and Management. Piscataway： IEEE， 2022： 1-5.
[2]	RATNATUNGA C N， LUTZKY V P， KUPZ A， et al. The rise of non-tuberculosis mycobacterial lung disease ［J］. Frontiers in Immunology， 2020， 11： No.303.
[3]	QUEVEDO S， DOMÍNGEZ F， PELAEZ E. Detecting multi thoracic diseases in chest x-ray images using deep learning techniques ［C］// Proceedings of the IEEE 13th International Conference on Pattern Recognition Systems. Piscataway： IEEE， 2023： 1-7.
[4]	ASAITHAMBI A， THAMILARASI V. Classification of lung chest X-ray images using deep learning with efficient optimizers ［C］// Proceedings of the IEEE 13th Annual Computing and Communication Workshop and Conference. Piscataway： IEEE， 2023： 465-469.
[5]	YANG F， LU P X， DENG M， et al. Annotations of lung abnormalities in the Shenzhen chest X-ray dataset for computer-aided screening of pulmonary diseases ［J］. Data， 2022， 7（7）： No.95.
[6]	LIU C， WEN J， LUO X， et al. DICNet： deep instance-level contrastive network for double incomplete multi-view multi-label classification ［C］// Proceedings of the 37th AAAI Conference on Artificial Intelligence. Palo Alto： AAAI Press， 2023： 8807-8815.
[7]	HUANG Z， ZHANG J， SHAN H. Twin contrastive learning with noisy labels ［C］// Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2023： 11661-11670.
[8]	LIAO H， WANG Q， ZHAO S， et al. Domain consensual contrastive learning for few-shot universal domain adaptation ［J］. Applied Intelligence， 2023， 53（22）： 27191-27206.
[9]	HAN T Y， NIU S Z， ZHANG W. Multimodal sequential recommendation algorithm based on contrastive learning ［J］. Journal of Computer Applications， 2022， 42（6）： 1683-1688.
[10]	JAISWAL A， BABU A R， ZADEH M Z， et al. A survey on contrastive self-supervised learning ［J］. Technologies， 2020， 9（1）： No.2.
[11]	LIU X， ZHANG F， HOU Z， et al. Self-supervised learning： generative or contrastive ［J］. IEEE Transactions on Knowledge and Data Engineering， 2023， 35（1）： 857-876.
[12]	HAN Y， CHEN C， TEWFIK A， et al. Knowledge-augmented contrastive learning for abnormality classification and localization in chest X-rays with radiomics using a feedback loop ［C］// Proceedings of the 2022 IEEE/CVF Winter Conference on Applications of Computer Vision. Piscataway： IEEE， 2022： 1789-1798.
[13]	WEI Z， PARK S， KIM J. A triplet contrast learning of global and local representations for unannotated medical images ［C］// Proceedings of the 2022 International Workshop on PRedictive Intelligence in MEdicine， LNCS 13564. Cham： Springer， 2022： 181-190.
[14]	ZENG D， KHEIR J N， ZENG P， et al. Contrastive learning with temporal correlated medical images： a case study using lung segmentation in chest X-rays （invited paper）［C］// Proceedings of the 2021 IEEE/ACM International Conference on Computer Aided Design. Piscataway： IEEE， 2021： 1-7.
[15]	WU H， XIAO F， LIANG C. Dual contrastive learning with anatomical auxiliary supervision for few-shot medical image segmentation ［C］// Proceedings of the 2022 European Conference on Computer Vision， LNCS 13680. Cham： Springer， 2022： 417-434.
[16]	ZHANG Y， JIANG H， MIURA Y， et al. Contrastive learning of medical visual representations from paired images and text ［C］// Proceedings of the 7th Machine Learning for Healthcare Conference. New York： JMLR.org， 2022： 2-25.
[17]	HE Y， YANG G， GE R， et al. Geometric visual similarity learning in 3D medical image self-supervised pre-training ［C］// Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2023： 9538-9547.
[18]	WEI X， NIU X， ZHANG X， et al. Deep pneumonia： attention-based contrastive learning for class-imbalanced pneumonia lesion recognition in chest X-rays ［C］// Proceedings of the 2022 IEEE International Conference on Big Data. Piscataway： IEEE， 2022： 5361-5369.
[19]	LIU Z W， FANG Y H， ZHENG M Y， et al. Lung disease diagnosis method based on attention mechanism and multi-tasking ［J］. Computer Engineering， 2025， 51（1）： 332-342.
[20]	ROBINSON J， CHUANG C Y， SRA S， et al. Contrastive learning with hard negative samples ［EB/OL］. ［2024-01-24］. .
[21]	CHEN T， KORNBLITH S， NOROUZI M， et al. A simple framework for contrastive learning of visual representations ［C］// Proceedings of the 37th International Conference on Machine Learning. New York： JMLR.org， 2020： 1597-1607.
[22]	HE K， FAN H， WU Y， et al. Momentum contrast for unsupervised visual representation learning ［C］// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2020： 9726-9735.
[23]	GRILL J B， STRUB F， ALTCHÉ F， et al. Bootstrap your own latent — a new approach to self-supervised learning ［C］// Proceedings of the 34th International Conference on Neural Information Processing Systems. Red Hook： Curran Associates Inc.， 2020： 21271-21284.
[24]	ZHENG M， YOU S， WANG F， et al. ReSSL： relational self-supervised learning with weak augmentation ［C］// Proceedings of the 35th International Conference on Neural Information Processing Systems. Red Hook： Curran Associates Inc.， 2021： 2543-2555.
[25]	CUI W， BAI L， YANG X， et al. A new contrastive learning framework for reducing the effect of hard negatives ［J］. Knowledge-Based Systems， 2023， 260： No.110121.
[26]	TU W， ZHOU S， LIU X， et al. Hierarchically contrastive hard sample mining for graph self-supervised pretraining ［J］. IEEE Transactions on Neural Networks and Learning Systems， 2024， 35（11）： 16748-16761.
[27]	KALANTIDIS Y， SARIYILDIZ M B， PION N， et al. Hard negative mixing for contrastive learning ［C］// Proceedings of the 34th International Conference on Neural Information Processing Systems. Red Hook： Curran Associates Inc.， 2020： 21798-21809.
[28]	LI J， ZHAO G， TAO Y， et al. Multi-task contrastive learning for automatic CT and X-ray diagnosis of COVID-19 ［J］. Pattern Recognition， 2021， 114： No.107848.
[29]	LI D. Attention-enhanced architecture for improved pneumonia detection in chest X-ray images ［J］. BMC Medical Imaging， 2024， 24： No.6.
[30]	PACKHÄUSER K， FOLLE L， THAMM F， et al. Generation of anonymous chest radiographs using latent diffusion models for training thoracic abnormality classification systems ［C］// Proceedings of the IEEE 20th International Symposium on Biomedical Imaging. Piscataway： IEEE， 2023： 1-5.
[31]	WANG X， PENG Y， LU L， et al. ChestX-ray8： hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases ［C］// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2017： 3462-3471.
[32]	KERMANY D S， GOLDBAUM M， CAI W， et al. Identifying medical diagnoses and treatable diseases by image-based deep learning ［J］. Cell， 2018， 172（5）： 1122-1131.e9.
[33]	CHOWDHURY M E H， RAHMAN T， KHANDAKAR A， et al. Can AI help in screening viral and COVID-19 pneumonia？［J］. IEEE Access， 2020， 8： 132665-132676.
[34]	HE K M， ZHANG X， REN S， et al. Deep residual learning for image recognition ［C］// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2016： 770-778.
[35]	HÉNAFF O J， SRINIVAS A， DE FAUW J， et al. Data-efficient image recognition with contrastive predictive coding ［C］// Proceedings of the 37th International Conference on Machine Learning. New York： ACM， 2020： 4182-4192.
