Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (7): 2369-2377. DOI: 10.11772/j.issn.1001-9081.2024070968

• Multimedia computing and computer simulation •

Dual-branch distribution consistency contrastive learning model for hard negative sample identification in chest X-rays

Jin XIE1, Surong CHU1, Yan QIANG1,2, Juanjuan ZHAO1,3, Hua ZHANG4, Yong GAO5

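  1. College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan 030600
  2. School of Software, North University of China, Taiyuan 030051
  3. School of Information Engineering, Jinzhong College of Information, Jinzhong, Shanxi 030800
  4. Department of CT Radiology, First Hospital of Shanxi Medical University, Taiyuan 030012
  5. Department of Respiratory and Critical Care Medicine, Sinopharm Tongmei General Hospital, Datong, Shanxi 037000
  • Received: 2024-07-09 Revised: 2024-09-25 Accepted: 2024-09-29 Online: 2025-07-10 Published: 2025-07-10
  • Corresponding author: QIANG Yan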
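  • About the authors: XIE Jin, born in 1999 in Rugao, Jiangsu, M. S. candidate. His research interests include computer vision and image processing.
    CHU Surong, born in 1990 in Taiyuan, Shanxi, Ph. D. candidate. Her research interests include computer vision and image processing.
    QIANG Yan, born in 1969 in Gujiao, Shanxi, Ph. D., professor, CCF member. His research interests include image processing and cloud computing. E-mail: qiangyan@tyut.edu.cn
    ZHAO Juanjuan, born in 1975 in Taiyuan, Shanxi, Ph. D., professor, CCF senior member. Her research interests include intelligent information processing and image processing.
    ZHANG Hua, born in 1973 in Taiyuan, Shanxi, M. S., chief physician. Her research interests include CT imaging diagnosis of cardiovascular and cerebrovascular diseases.
    GAO Yong, born in 1974 in Datong, Shanxi, M. S., deputy chief physician. His research interests include respiratory and critical care medicine.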
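  • Supported by:
    National Natural Science Foundation of China (U21A20469); National Natural Science Foundation of China (62376183); Open Project of NHC Key Laboratory of Pneumoconiosis (YKFKT004); Central Government Guiding Local Science and Technology Development Fund (YDZJSX2022C004); Special Project of Science and Technology Innovation Teams of Shanxi Province (202304051001009)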

Dual-branch distribution consistency contrastive learning model for hard negative sample identification in chest X-rays

Jin XIE1, Surong CHU1, Yan QIANG1,2, Juanjuan ZHAO1,3, Hua ZHANG4, Yong GAO5

  1. College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan Shanxi 030600, China
  2. School of Software, North University of China, Taiyuan Shanxi 030051, China
  3. School of Information Engineering, Jinzhong College of Information, Jinzhong Shanxi 030800, China
  4. Department of CT Radiology, First Hospital of Shanxi Medical University, Taiyuan Shanxi 030012, China
  5. Department of Respiratory and Critical Care Medicine, Sinopharm Tongmei General Hospital, Datong Shanxi 037000, China
  • Received:2024-07-09 Revised:2024-09-25 Accepted:2024-09-29 Online:2025-07-10 Published:2025-07-10
  • Contact: Yan QIANG
  • About author: XIE Jin, born in 1999, M. S. candidate. His research interests include computer vision and image processing.
    CHU Surong, born in 1990, Ph. D. candidate. Her research interests include computer vision and image processing.
    QIANG Yan, born in 1969, Ph. D., professor. His research interests include image processing and cloud computing.
    ZHAO Juanjuan, born in 1975, Ph. D., professor. Her research interests include intelligent information processing and image processing.
    ZHANG Hua, born in 1973, M. S., chief physician. Her research interests include CT imaging diagnosis of cardiovascular and cerebrovascular diseases.
    GAO Yong, born in 1974, M. S., deputy chief physician. His research interests include respiratory and critical care medicine.
  • Supported by:
    National Natural Science Foundation of China (U21A20469); National Natural Science Foundation of China (62376183); Open Project of NHC Key Laboratory of Pneumoconiosis (YKFKT004); Central Government Guiding Local Science and Technology Development Fund (YDZJSX2022C004); Special Project of Science and Technology Innovation Teams of Shanxi Province (202304051001009)

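Abstract:

Contrastive learning (CL) methods have difficulty distinguishing similar chest X-ray samples and identifying tiny lesions in medical images. To address this, a dual-branch distribution consistency contrastive learning model (TCL) is proposed. First, inpainting and outpainting data augmentation strategies are used to strengthen the model's attention to lung texture and improve its ability to recognize complex structures. Second, a collaborative learning method is used to further increase the model's sensitivity to tiny lung lesions and to capture lesion information from different views. Finally, the heavy-tailed property of the Student-t distribution is used to distinguish hard negative samples and to constrain the distribution consistency between different augmented views and the samples, which strengthens the learning of feature relationships between hard negatives and other samples and reduces the influence of hard negatives on the model. Experimental results on four chest X-ray datasets, namely pneumoconiosis, NIH (National Institutes of Health), Chest X-Ray Images (Pneumonia), and COVID-19 (Corona Virus Disease 2019), show that, compared with the MoCo v2 (Momentum Contrastive learning) model, TCL improves accuracy by 6.14%, 3.08%, 0.65%, and 4.67%, respectively, and improves transfer performance on the COVID-19 dataset by 4.10%, 0.61%, and 8.41% at label rates of 5%, 20%, and 50%, respectively. In addition, CAM (Class Activation Mapping) visualization shows that TCL attends to important pathological regions, which verifies the effectiveness of the proposed model.

Key words: self-supervised learning, contrastive learning, medical image processing, hard negative sample, distribution consistency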

Abstract:

Contrastive Learning (CL) methods struggle to distinguish similar chest X-ray samples and to detect tiny lesions in medical images. To address these issues, a dual-branch distribution consistency contrastive learning model (TCL) was proposed. Firstly, inpainting and outpainting data augmentation strategies were employed to strengthen the model’s focus on lung textures, improving its ability to recognize complex structures. Secondly, a collaborative learning approach was used to further enhance the model’s sensitivity to tiny lesions in the lungs and to capture lesion information from different perspectives. Finally, the heavy-tailed characteristic of the Student-t distribution was utilized to differentiate hard negative samples and to constrain the consistency of distributions among different augmented views and samples, thereby reinforcing the learning of feature relationships between hard negatives and other samples and reducing the influence of hard negatives on the model. Experimental results on four chest X-ray datasets, namely pneumoconiosis, NIH (National Institutes of Health), Chest X-Ray Images (Pneumonia), and COVID-19 (Corona Virus Disease 2019), demonstrate that, compared to the MoCo v2 (Momentum Contrastive Learning) model, the TCL model improves accuracy by 6.14%, 3.08%, 0.65%, and 4.67%, respectively. In terms of transfer performance on the COVID-19 dataset, the TCL model achieves improvements of 4.10%, 0.61%, and 8.41% at label rates of 5%, 20%, and 50%, respectively. Furthermore, CAM (Class Activation Mapping) visualization verifies that the TCL model focuses on critical pathological regions, confirming the model’s effectiveness.
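The abstract does not give the formula behind the Student-t step, so the following is only a generic sketch of why a heavy-tailed kernel can help with hard negatives, not the paper's loss. It assumes similarities are computed from Euclidean distances between embeddings, compares a Gaussian kernel with a Student-t kernel (the same form used in t-SNE), and uses a KL divergence between the neighbor distributions of two views as a stand-in for a distribution consistency term. All function names and parameters (`nu`, `sigma`) are hypothetical.

```python
import numpy as np


def gaussian_kernel(d, sigma=1.0):
    """Light-tailed similarity: decays like exp(-d^2 / (2 * sigma^2))."""
    return np.exp(-np.square(d) / (2.0 * sigma ** 2))


def student_t_kernel(d, nu=1.0):
    """Heavy-tailed similarity taken from the (unnormalized) Student-t density.

    With nu = 1 this is 1 / (1 + d^2), the kernel used in t-SNE.
    """
    return (1.0 + np.square(d) / nu) ** (-(nu + 1.0) / 2.0)


def neighbor_distribution(anchor, others, kernel):
    """Normalize kernel similarities from one anchor embedding to the
    other embeddings so that they form a probability distribution."""
    d = np.linalg.norm(others - anchor, axis=1)
    w = kernel(d)
    return w / w.sum()


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q); used here as an assumed consistency measure between
    the neighbor distributions produced by two augmented views."""
    return float(np.sum(p * np.log((p + eps) / (q + eps))))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical embeddings: one anchor and five other samples,
    # seen through two slightly different augmented views.
    anchor = rng.normal(size=8)
    view_a = rng.normal(size=(5, 8))
    view_b = view_a + 0.05 * rng.normal(size=(5, 8))

    p = neighbor_distribution(anchor, view_a, student_t_kernel)
    q = neighbor_distribution(anchor, view_b, student_t_kernel)
    print("consistency (KL) between views:", kl_divergence(p, q))

    # Far-away samples keep non-negligible weight under the t kernel,
    # whereas the Gaussian kernel pushes them to nearly zero.
    for dist in (0.5, 1.0, 4.0):
        print(dist, gaussian_kernel(dist), student_t_kernel(dist))
```

Under these assumptions, the heavier tail keeps distant samples from vanishing out of the distribution, so the relationship between hard negatives and the remaining samples still contributes to the consistency term instead of being dominated by the closest few samples.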

Key words: self-supervised learning (SSL), contrastive learning (CL), medical image processing, hard negative sample, distribution consistency

CLC number: