Journal of Computer Applications


Dual-branch distribution consistency contrastive learning network for hard negative sample identification in chest X-rays

  

  • Received: 2024-07-09 Revised: 2024-09-05 Online: 2024-11-19 Published: 2024-11-19


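谢劲1, 褚苏荣2, 强彦1, 赵涓涓1, 张华3, 高勇4

  1. Taiyuan University of Technology
  2. College of Information and Computer, Taiyuan University of Technology
  3. First Hospital of Shanxi Medical University
  4. Sinopharm Tongmei General Hospital
  • Corresponding author: 谢劲
  • Supported by:
    National Natural Science Foundation of China; National Natural Science Foundation of China; Open Project of the National Health Commission Key Laboratory of Pneumoconiosis; Central Government Guided Local Science and Technology Development Fund; National Health Commission Key Laboratory of Pneumoconiosis; Shanxi Province Basic Research Program; Shanxi Province Special Fund for Science and Technology Innovation Talent Teams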

Abstract: Contrastive Learning (CL) methods struggle to distinguish similar chest X-ray samples and to detect tiny lesions in medical images. To address this, a Dual-Branch Distribution Consistency Contrastive Learning model (TCL) was proposed. Firstly, inpainting and outpainting data augmentation strategies were employed to strengthen the model's focus on lung textures, improving its ability to recognize complex structures. Secondly, a collaborative learning approach was used to enhance the model's sensitivity to tiny lung lesions by capturing lesion information from different views. Finally, the heavy-tailed property of the Student-t distribution was used to distinguish hard negative samples: the distributions over different augmented views and samples were constrained to be consistent, which strengthens the learned feature relationships between hard negatives and the remaining samples and reduces the impact of hard negatives on the model. Experimental results on four chest X-ray datasets (Pneumoconiosis, NIH Chest X-ray Dataset, Chest X-Ray Images (Pneumonia), and COVID-19 Radiography Database) show that TCL improves accuracy by 6.14%, 3.08%, 0.65%, and 4.67%, respectively, over the MoCo v2 (Momentum Contrast v2) method. In transfer experiments on the COVID-19 dataset, TCL achieves improvements of 4.10%, 0.61%, and 8.41% at label rates of 5%, 20%, and 50%, respectively. Furthermore, Class Activation Mapping (CAM) visualization shows that TCL focuses on critical pathological regions, supporting the effectiveness of the proposed method.
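The abstract does not state the loss function, so the following is only a minimal sketch of how a Student-t based distribution consistency constraint could look. It assumes a row-normalised Student-t kernel over embedding distances and a symmetric KL divergence between the similarity distributions of two augmented views. The names `student_t_similarity`, `symmetric_kl`, `distribution_consistency_loss`, the reference set `bank`, and the degrees-of-freedom parameter `dof` are all hypothetical and are not taken from the paper.

```python
import numpy as np


def student_t_similarity(anchors, candidates, dof=1.0):
    """Row-normalised Student-t kernel between two sets of embeddings.

    Hypothetical helper: the paper's exact similarity is not given in the abstract.
    """
    # Squared Euclidean distance between every anchor and every candidate.
    diff = anchors[:, None, :] - candidates[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    # Heavy-tailed kernel: decays polynomially with distance, so distant
    # (negative) samples keep a non-negligible share of the probability mass.
    kernel = (1.0 + sq_dist / dof) ** (-(dof + 1.0) / 2.0)
    return kernel / kernel.sum(axis=1, keepdims=True)


def symmetric_kl(p, q, eps=1e-12):
    """Mean symmetric KL divergence between matching rows of p and q."""
    p = np.clip(p, eps, None)
    q = np.clip(q, eps, None)
    kl_pq = np.sum(p * np.log(p / q), axis=1)
    kl_qp = np.sum(q * np.log(q / p), axis=1)
    return float(np.mean(kl_pq + kl_qp) / 2.0)


def distribution_consistency_loss(view_a, view_b, bank, dof=1.0):
    """Penalise disagreement between the two views' similarity distributions."""
    p = student_t_similarity(view_a, bank, dof)
    q = student_t_similarity(view_b, bank, dof)
    return symmetric_kl(p, q)


# Toy data standing in for encoder outputs (assumed shapes: samples x features).
rng = np.random.default_rng(0)
bank = rng.normal(size=(16, 8))                    # reference samples
view_a = rng.normal(size=(4, 8))                   # first augmented view
view_b = view_a + 0.05 * rng.normal(size=(4, 8))   # slightly perturbed second view
unrelated = rng.normal(size=(4, 8))                # embeddings of different images

loss_close = distribution_consistency_loss(view_a, view_b, bank)
loss_far = distribution_consistency_loss(view_a, unrelated, bank)
print("consistent views:", loss_close)
print("unrelated views: ", loss_far)
```

In this sketch the loss is near zero when both views induce the same similarity distribution over the reference samples and grows as the two distributions diverge. The polynomial tail of the kernel is what lets far-away samples still contribute, in contrast to a Gaussian kernel whose weights vanish exponentially.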

Key words: Self-Supervised Learning (SSL), Contrastive Learning (CL), medical image, hard negative sample, distribution consistency
