Thoracic disease classification method based on cross-scale attention network

doi:10.11772/j.issn.1001-9081.2024071019

Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (8): 2712-2719.

• Multimedia computing and computer simulation •

Jinhao LIN¹, Chuan LUO¹ (corresponding author), Tianrui LI², Hongmei CHEN²

1. College of Computer Science, Sichuan University, Chengdu, Sichuan 610065, China
2. School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu, Sichuan 611756, China

Received: 2024-07-19 Revised: 2024-11-05 Accepted: 2024-11-05 Online: 2024-12-03 Published: 2025-08-10
Contact: Chuan LUO
About the authors: LIN Jinhao, born in 1999 in Yangjiang, Guangdong, M. S. candidate. His research interests include deep learning and medical image processing.
LI Tianrui, born in 1969 in Putian, Fujian, Ph. D., professor. His research interests include artificial intelligence, data mining, and knowledge discovery.
CHEN Hongmei, born in 1971 in Chengdu, Sichuan, Ph. D., professor. Her research interests include data mining and granular computing.
Supported by:
National Natural Science Foundation of China (62476182, 62076171, 62376230); Natural Science Foundation of Sichuan Province (2022NSFSC0898)

Abstract:

Automatic identification of thoracic diseases from chest X-rays is an important research area in computer-aided diagnosis. However, many existing thoracic disease classification methods struggle with the variation in lesion size and cannot accurately identify and localize the lesion regions of different diseases. To address these problems, a thoracic disease classification method based on a Cross-scale Attention Network (CANet) is proposed. The method uses DenseNet-121 as the feature extraction network and integrates three main modules: Self Aware Attention (SAA), Upward Focus Attention (UFA), and Downward Guidance Attention (DGA). The SAA module extracts channel and abnormal-region information related to thoracic diseases, refining the pathological features at each spatial position and reducing interference from irrelevant regions. To enable cross-scale interaction of spatial context information at different scales, the UFA and DGA modules are used to calibrate the image features. In addition, a Spatial Attention Pyramid Pooling (SAPP) module is proposed to fuse multi-scale features from different feature maps, thereby improving thoracic disease detection performance. Experimental results on the ChestX-ray14 and DR-Pneumonia datasets show that the proposed method achieves average Area Under Curve (AUC) values of 83.4% and 82.6%, respectively, outperforming methods such as DualCheXNet, A³Net, and CheXGAT. Specifically, compared with CheXGAT, the proposed method improves the average AUC by 0.7 and 0.1 percentage points, respectively. These results indicate that the proposed method can identify important information in chest X-rays and effectively improves thoracic disease classification performance.

Key words: thoracic disease classification, chest X-ray, attention mechanism, Convolutional Neural Network (CNN), feature fusion
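The average AUC quoted in the abstract is, in multi-label settings such as this one, usually computed per disease and then averaged. The sketch below illustrates that reading in plain Python; it is an assumption about the evaluation protocol rather than the authors' code, and the helper names (`binary_auc`, `mean_auc`) and the toy data are hypothetical.

```python
from typing import Sequence


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def mean_auc(label_matrix: Sequence[Sequence[int]],
             score_matrix: Sequence[Sequence[float]]) -> float:
    """Macro average: one AUC per disease column, then the unweighted mean."""
    num_classes = len(label_matrix[0])
    per_class = [
        binary_auc([row[c] for row in label_matrix],
                   [row[c] for row in score_matrix])
        for c in range(num_classes)
    ]
    return sum(per_class) / num_classes


# Hypothetical toy data: 4 images, 2 disease labels (not taken from the paper).
toy_labels = [[1, 0], [0, 1], [1, 1], [0, 0]]
toy_scores = [[0.9, 0.2], [0.3, 0.8], [0.6, 0.4], [0.4, 0.5]]
toy_mean_auc = mean_auc(toy_labels, toy_scores)
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration; a rank-based formulation would be the usual choice on a full test set.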

CLC number: TP391

Cite this article: Jinhao LIN, Chuan LUO, Tianrui LI, Hongmei CHEN. Thoracic disease classification method based on cross-scale attention network[J]. Journal of Computer Applications, 2025, 45(8): 2712-2719.

Figures/Tables: 9

References: 29

[1]	MA X， ZHU L， KURCHE J S， et al. Global and regional burden of interstitial lung disease and pulmonary sarcoidosis from 1990 to 2019： results from the Global Burden of Disease study 2019［J］. Thorax， 2022， 77（6）： 596-605.
[2]	BRUNO M A， WALKER E A， ABUJUDEH H H. Understanding and confronting our mistakes： the epidemiology of error in radiology and strategies for error reduction［J］. RadioGraphics， 2015， 35（6）： 1668-1676.
[3]	WANG X， PENG Y， LU L， et al. ChestX-ray8： hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases［C］// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2017： 3462-3471.
[4]	RONNEBERGER O， FISCHER P， BROX T. U-Net： convolutional networks for biomedical image segmentation［C］// Proceedings of the 2015 International Conference on Medical Image Computing and Computer-Assisted Intervention， LNCS 9351. Cham： Springer， 2015： 234-241.
[5]	HUYNH L D， BOUTRY N. A U-net++ with pre-trained EfficientNet backbone for segmentation of diseases and artifacts in endoscopy images and videos［EB/OL］.［2024-10-11］. .
[6]	HE K， GKIOXARI G， DOLLÁR P， et al. Mask R-CNN［C］// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway： IEEE， 2017： 2980-2988.
[7]	RAJPURKAR P， IRVIN J， ZHU K， et al. CheXNet： radiologist-level pneumonia detection on chest X-rays with deep learning［EB/OL］. ［2024-05-23］..
[8]	KARADDI S H， SHARMA L D. Automated multi-class classification of lung diseases from CXR-images using pre-trained convolutional neural networks［J］. Expert Systems with Applications， 2023， 211： No.118650.
[9]	CHEN B， LI J， LU G， et al. Lesion location attention guided network for multi-label thoracic disease classification in chest X-rays［J］. IEEE Journal of Biomedical and Health Informatics， 2020， 24（7）： 2016-2027.
[10]	SONG Z Y， LUO C， LI T R， et al. Classification of thoracic diseases based on attention mechanisms and two-branch networks［J］. Computer Science， 2024， 51（11A）： No.230900116. (in Chinese)
[11]	CHEN B， LI J， LU G， et al. Label co-occurrence learning with graph convolutional networks for multi-label chest X-ray image classification［J］. IEEE Journal of Biomedical and Health Informatics， 2020， 24（8）： 2292-2302.
[12]	INDUMATHI V， SIVA R. An efficient lung disease classification from X-ray images using hybrid Mask-RCNN and BiDLSTM［J］. Biomedical Signal Processing and Control， 2023， 81： No.104340.
[13]	WANG H， WANG S， QIN Z， et al. Triple attention learning for classification of 14 thoracic diseases using chest radiography［J］. Medical Image Analysis， 2021， 67： No.101846.
[14]	CHEN B， LI J， GUO X， et al. DualCheXNet： dual asymmetric feature learning for thoracic disease classification in chest X-rays［J］. Biomedical Signal Processing and Control， 2019， 53： No.101554.
[15]	LIU H， WANG L， NAN Y， et al. SDFN： segmentation-based deep fusion network for thoracic disease classification in chest X-ray images［J］. Computerized Medical Imaging and Graphics， 2019， 75： 66-73.
[16]	FAISAL M， DARMAWAN J T， BACHROIN N， et al. CheXViT： CheXNet and Vision Transformer to multi-label chest X-ray image classification［C］// Proceedings of the 2023 IEEE International Symposium on Medical Measurements and Applications. Piscataway： IEEE， 2023： 1-6.
[17]	WANG H， JIA H， LU L， et al. Thorax-Net： an attention regularized deep neural network for classification of thoracic diseases on chest radiography［J］. IEEE Journal of Biomedical and Health Informatics， 2020， 24（2）： 475-485.
[18]	XU Y， LAM H K， JIA G. MANet： a two-stage deep learning method for classification of COVID-19 from chest X-ray images［J］. Neurocomputing， 2021， 443： 96-105.
[19]	MA C， WANG H， HOI S C H. Multi-label thoracic disease image classification with cross-attention networks［C］// Proceedings of the 2019 International Conference on Medical Image Computing and Computer-Assisted Intervention， LNCS 11769. Cham： Springer， 2019： 730-738.
[20]	SUN Z， QU L， LUO J， et al. Label correlation transformer for automated chest X-ray diagnosis with reliable interpretability［J］. La Radiologia Medica， 2023， 128（6）： 726-733.
[21]	GUAN Q， HUANG Y. Multi-label chest X-ray image classification via category-wise residual attention learning［J］. Pattern Recognition Letters， 2020， 130： 259-266.
[22]	CHEN B， ZHANG Z， LI Y， et al. Multi-label chest X-ray image classification via semantic similarity graph embedding［J］. IEEE Transactions on Circuits and Systems for Video Technology， 2022， 32（4）： 2455-2468.
[23]	LEE Y W， HUANG S K， CHANG R F. CheXGAT： a disease correlation-aware network for thorax disease diagnosis from chest X-ray images［J］. Artificial Intelligence in Medicine， 2022， 132： No.102382.
[24]	WANG Q， WU B， ZHU P， et al. ECA-Net： efficient channel attention for deep convolutional neural networks［C］// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2020： 11531-11539.
[25]	HU J， SHEN L， SUN G. Squeeze-and-excitation networks［C］// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2018： 7132-7141.
[26]	RIDNIK T， BEN-BARUCH E， ZAMIR N， et al. Asymmetric loss for multi-label classification［C］// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway： IEEE， 2021： 82-91.
[27]	HE K， ZHANG X， REN S， et al. Spatial pyramid pooling in deep convolutional networks for visual recognition［J］. IEEE Transactions on Pattern Analysis and Machine Intelligence， 2015， 37（9）： 1904-1916.
[28]	LIN T Y， GOYAL P， GIRSHICK R， et al. Focal loss for dense object detection［C］// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway： IEEE， 2017： 2999-3007.
[29]	SELVARAJU R R， COGSWELL M， DAS A， et al. Grad-CAM： visual explanations from deep networks via gradient-based localization［C］// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway： IEEE， 2017： 618-626.

Method	Atel	Card	Effu	Infi	Mass	Nodu	Pne1	Pne2	Cons	Edema	Emph	Fibr	PT	Hernia	Average
CheXNet [7]	0.769	0.885	0.825	0.694	0.824	0.759	0.715	0.852	0.745	0.842	0.906	0.821	0.766	0.901	0.807
SDFN [15]	0.781	0.885	0.832	0.700	0.815	0.765	0.719	0.866	0.743	0.842	0.921	0.835	0.791	0.911	0.815
CRAL [21]	0.781	0.880	0.829	0.702	0.834	0.773	0.729	0.857	0.754	0.850	0.908	0.830	0.778	0.917	0.816
CAN [19]	0.777	0.894	0.829	0.696	0.838	0.771	0.722	0.862	0.750	0.846	0.908	0.827	0.779	0.934	0.817
DualCheXNet [14]	0.784	0.888	0.831	0.705	0.838	0.796	0.727	0.876	0.746	0.852	0.942	0.837	0.796	0.912	0.823
LLAGnet [9]	0.783	0.885	0.834	0.703	0.841	0.790	0.729	0.877	0.754	0.851	0.939	0.832	0.798	0.916	0.824
A³Net [13]	0.779	0.895	0.836	0.710	0.834	0.777	0.737	0.878	0.759	0.855	0.933	0.838	0.791	0.938	0.826
CheXGCN [11]	0.786	0.893	0.832	0.699	0.840	0.800	0.739	0.876	0.751	0.850	0.944	0.834	0.795	0.929	0.826
CheXGAT [23]	0.787	0.879	0.837	0.699	0.839	0.793	0.741	0.879	0.755	0.851	0.945	0.842	0.794	0.931	0.827
SSGE [22]	0.792	0.892	0.840	0.714	0.848	0.812	0.733	0.885	0.753	0.848	0.948	0.827	0.795	0.932	0.830
Proposed method	0.823	0.887	0.873	0.703	0.850	0.794	0.741	0.877	0.798	0.882	0.924	0.837	0.773	0.917	0.834
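As a quick consistency check, the final column of the table above can be recomputed from the per-class values. The sketch below does this for the last row (the proposed method) and the CheXGAT row, assuming the final column is the unweighted mean of the 14 per-class AUCs rounded to three decimals. The resulting averages match the ChestX-ray14 figures quoted in the abstract.

```python
# Per-class AUC values copied from the table above (columns Atel ... Hernia).
proposed = [0.823, 0.887, 0.873, 0.703, 0.850, 0.794, 0.741,
            0.877, 0.798, 0.882, 0.924, 0.837, 0.773, 0.917]
chexgat = [0.787, 0.879, 0.837, 0.699, 0.839, 0.793, 0.741,
           0.879, 0.755, 0.851, 0.945, 0.842, 0.794, 0.931]


def average_auc(values):
    """Unweighted mean of per-class AUC values (assumed definition of the last column)."""
    return sum(values) / len(values)


proposed_avg = round(average_auc(proposed), 3)
chexgat_avg = round(average_auc(chexgat), 3)

# Difference of the rounded averages, expressed in percentage points.
gap_pp = round((proposed_avg - chexgat_avg) * 100, 1)

print(proposed_avg, chexgat_avg, gap_pp)
```

The gap is taken between the rounded averages, which is how a difference in percentage points would be read off the table.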


Method	Cons_1	Cons_2	Cons_3	Cons_4	ILD	Emph	Atel	AB	Bron	Pne2	Effu	Average
CheXNet [7]	0.698	0.640	0.687	0.821	0.894	0.815	0.866	0.884	0.821	0.716	0.907	0.795
CRAL [21]	0.681	0.664	0.697	0.823	0.897	0.805	0.816	0.864	0.882	0.768	0.888	0.799
CAN [19]	0.708	0.677	0.686	0.838	0.943	0.806	0.876	0.869	0.827	0.754	0.926	0.810
DualCheXNet [14]	0.730	0.701	0.691	0.853	0.931	0.833	0.882	0.872	0.864	0.761	0.928	0.822
A³Net [13]	0.679	0.675	0.709	0.857	0.883	0.794	0.888	0.889	0.896	0.769	0.924	0.815
CheXGCN [11]	0.713	0.685	0.693	0.861	0.941	0.793	0.884	0.887	0.879	0.752	0.938	0.821
CheXGAT [23]	0.709	0.669	0.711	0.859	0.906	0.790	0.909	0.885	0.913	0.773	0.948	0.825
Proposed method	0.697	0.693	0.714	0.850	0.923	0.815	0.911	0.881	0.887	0.774	0.940	0.826
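The averages in the table above differ by only 0.001 between the last row (the proposed method) and CheXGAT, so a per-class view is informative. The sketch below compares the two rows column by column using the values from the table. The variable names are illustrative, and the comparison itself is not part of the original page.

```python
# Column names and per-class AUC values copied from the table above.
classes = ["Cons_1", "Cons_2", "Cons_3", "Cons_4", "ILD", "Emph",
           "Atel", "AB", "Bron", "Pne2", "Effu"]
proposed = [0.697, 0.693, 0.714, 0.850, 0.923, 0.815,
            0.911, 0.881, 0.887, 0.774, 0.940]
chexgat = [0.709, 0.669, 0.711, 0.859, 0.906, 0.790,
           0.909, 0.885, 0.913, 0.773, 0.948]

# Per-class difference (proposed minus CheXGAT) in percentage points.
deltas = {
    name: round((p - c) * 100, 1)
    for name, p, c in zip(classes, proposed, chexgat)
}

# Classes on which the proposed method has the higher AUC.
wins = [name for name, delta in deltas.items() if delta > 0]

print(deltas)
print(wins)
```

On these figures the proposed method is ahead on 6 of the 11 classes, so the small average gain reflects a mix of per-class gains and losses rather than a uniform improvement.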


Model	ChestX-ray14 (average AUC)	DR-Pneumonia (average AUC)
Model-1	0.825	0.821
Model-2	0.828	0.819
Model-3	0.832	0.822
Proposed method	0.834	0.826


Pooling strategy	ChestX-ray14 (average AUC)	DR-Pneumonia (average AUC)
GAP	0.822	0.818
SPP	0.830	0.824
SAPP	0.834	0.826
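To make the difference between the first two rows concrete, the sketch below applies global average pooling (GAP) and a basic spatial pyramid pooling (SPP, in the spirit of reference [27]) to a single-channel feature map in plain Python. The pyramid levels and the toy feature map are assumptions chosen for illustration. The attention weighting that SAPP adds is not described on this page and is therefore not modeled.

```python
def global_average_pool(fmap):
    """GAP: collapse the whole map into one value, discarding spatial layout."""
    values = [v for row in fmap for v in row]
    return sum(values) / len(values)


def spatial_pyramid_pool(fmap, levels=(1, 2)):
    """SPP: max-pool over an n x n grid for each level n and concatenate the results."""
    height, width = len(fmap), len(fmap[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Bin boundaries: floor for the start, ceiling for the end,
            # so every bin is non-empty even when the map is small.
            r0, r1 = (i * height) // n, ((i + 1) * height + n - 1) // n
            for j in range(n):
                c0, c1 = (j * width) // n, ((j + 1) * width + n - 1) // n
                pooled.append(max(fmap[r][c]
                                  for r in range(r0, r1)
                                  for c in range(c0, c1)))
    return pooled


# Hypothetical 4 x 4 feature map (not data from the paper).
toy_map = [[1, 2, 3, 4],
           [5, 6, 7, 8],
           [9, 10, 11, 12],
           [13, 14, 15, 16]]

gap_value = global_average_pool(toy_map)   # single scalar
spp_vector = spatial_pyramid_pool(toy_map)  # 1 + 4 values for levels (1, 2)
```

GAP returns one number per channel, whereas SPP keeps a coarse spatial layout at several scales, which is one plausible reason a multi-scale pooling scheme helps when lesion sizes vary.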

Loss function	ChestX-ray14 (average AUC)	DR-Pneumonia (average AUC)
Cross-entropy loss	0.821	0.814
Focal loss	0.827	0.817
Asymmetric loss	0.834	0.826
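The three losses compared above can be written for a single label with predicted probability `p` and target `y`. The sketch below follows the standard forms of focal loss [28] and asymmetric loss [26]. The hyperparameter defaults (`gamma`, `gamma_pos`, `gamma_neg`, `margin`) are illustrative assumptions, not the settings used in the paper, and the focal loss omits the optional class-balancing weight.

```python
import math

EPS = 1e-12  # numerical guard against log(0)


def _clip(p):
    return min(max(p, EPS), 1.0 - EPS)


def cross_entropy(p, y):
    """Binary cross-entropy for one label."""
    p = _clip(p)
    return -math.log(p) if y == 1 else -math.log(1.0 - p)


def focal_loss(p, y, gamma=2.0):
    """Focal loss: down-weights well-classified examples by (1 - p_t) ** gamma."""
    p = _clip(p)
    p_t = p if y == 1 else 1.0 - p
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def asymmetric_loss(p, y, gamma_pos=0.0, gamma_neg=4.0, margin=0.05):
    """Asymmetric loss: separate focusing for positives and negatives,
    plus a probability margin that zeroes out very easy negatives."""
    if y == 1:
        p = _clip(p)
        return -((1.0 - p) ** gamma_pos) * math.log(p)
    p_m = min(max(p - margin, 0.0), 1.0 - EPS)  # shifted negative probability
    return -(p_m ** gamma_neg) * math.log(1.0 - p_m)


# With all focusing switched off, both variants reduce to cross-entropy.
same_as_ce = abs(asymmetric_loss(0.3, 0, gamma_neg=0.0, margin=0.0)
                 - cross_entropy(0.3, 0)) < 1e-12
```

Treating positives and negatives differently is aimed at multi-label data where most labels are negative for any given image, which is the usual situation in chest X-ray datasets.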
