Journal of Computer Applications, 2025, Vol. 45, Issue 8: 2712-2719. DOI: 10.11772/j.issn.1001-9081.2024071019
• Multimedia computing and computer simulation •
Jinhao LIN1, Chuan LUO1, Tianrui LI2, Hongmei CHEN2
Received: 2024-07-19
Revised: 2024-11-05
Accepted: 2024-11-05
Online: 2024-12-03
Published: 2025-08-10
Contact: Chuan LUO
About author: LIN Jinhao, born in 1999 in Yangjiang, Guangdong, M.S. candidate. His research interests include deep learning and medical image processing.
Jinhao LIN, Chuan LUO, Tianrui LI, Hongmei CHEN. Thoracic disease classification method based on cross-scale attention network[J]. Journal of Computer Applications, 2025, 45(8): 2712-2719.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2024071019
Method | Atel | Card | Effu | Infi | Mass | Nodu | Pne1 | Pne2 | Cons | Edema | Emph | Fibr | PT | Hernia | Average |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
CheXNet[7] | 0.769 | 0.885 | 0.825 | 0.694 | 0.824 | 0.759 | 0.715 | 0.852 | 0.745 | 0.842 | 0.906 | 0.821 | 0.766 | 0.901 | 0.807 |
SDFN[15] | 0.781 | 0.885 | 0.832 | 0.700 | 0.815 | 0.765 | 0.719 | 0.866 | 0.743 | 0.842 | 0.921 | 0.835 | 0.791 | 0.911 | 0.815 |
CRAL[21] | 0.781 | 0.880 | 0.829 | 0.702 | 0.834 | 0.773 | 0.729 | 0.857 | 0.754 | 0.850 | 0.908 | 0.830 | 0.778 | 0.917 | 0.816 |
CAN[19] | 0.777 | 0.894 | 0.829 | 0.696 | 0.838 | 0.771 | 0.722 | 0.862 | 0.750 | 0.846 | 0.908 | 0.827 | 0.779 | 0.934 | 0.817 |
DualCheXNet[14] | 0.784 | 0.888 | 0.831 | 0.705 | 0.838 | 0.796 | 0.727 | 0.876 | 0.746 | 0.852 | 0.942 | 0.837 | 0.796 | 0.912 | 0.823 |
LLAGnet[9] | 0.783 | 0.885 | 0.834 | 0.703 | 0.841 | 0.790 | 0.729 | 0.877 | 0.754 | 0.851 | 0.939 | 0.832 | 0.798 | 0.916 | 0.824 |
A3Net[13] | 0.779 | 0.895 | 0.836 | 0.710 | 0.834 | 0.777 | 0.737 | 0.878 | 0.759 | 0.855 | 0.933 | 0.838 | 0.791 | 0.938 | 0.826 |
CheXGCN[11] | 0.786 | 0.893 | 0.832 | 0.699 | 0.840 | 0.800 | 0.739 | 0.876 | 0.751 | 0.850 | 0.944 | 0.834 | 0.795 | 0.929 | 0.826 |
CheXGAT[23] | 0.787 | 0.879 | 0.837 | 0.699 | 0.839 | 0.793 | 0.741 | 0.879 | 0.755 | 0.851 | 0.945 | 0.842 | 0.794 | 0.931 | 0.827 |
SSGE[22] | 0.792 | 0.892 | 0.840 | 0.714 | 0.848 | 0.812 | 0.733 | 0.885 | 0.753 | 0.848 | 0.948 | 0.827 | 0.795 | 0.932 | 0.830 |
Proposed method | 0.823 | 0.887 | 0.873 | 0.703 | 0.850 | 0.794 | 0.741 | 0.877 | 0.798 | 0.882 | 0.924 | 0.837 | 0.773 | 0.917 | 0.834 |
Tab. 1 AUC value comparison of different methods on ChestX-ray14 dataset
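The per-disease scores in Tab. 1 are areas under the ROC curve, with the last column averaging over the 14 labels. As an illustrative sketch (not the authors' evaluation code), per-label AUC can be computed with the rank-based Mann-Whitney estimate and then macro-averaged:

```python
def auc_score(y_true, y_score):
    """Rank-based AUC for one disease label: the probability that a
    randomly chosen positive case scores above a randomly chosen
    negative case (ties count as half)."""
    pos = [s for s, y in zip(y_score, y_true) if y == 1]
    neg = [s for s, y in zip(y_score, y_true) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def macro_auc(labels_per_class, scores_per_class):
    """Average AUC over disease classes, as in the 'Average' column."""
    aucs = [auc_score(y, s)
            for y, s in zip(labels_per_class, scores_per_class)]
    return sum(aucs) / len(aucs)

# Toy example: one mis-ranked pair among 2x2 positive/negative pairs
print(auc_score([1, 1, 0, 0], [0.9, 0.2, 0.8, 0.1]))  # 0.75
```

In practice a library routine such as scikit-learn's `roc_auc_score` would be used; the hand-rolled version above only makes the metric explicit.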
Method | Cons_1 | Cons_2 | Cons_3 | Cons_4 | ILD | Emph | Atel | AB | Bron | Pne2 | Effu | Average |
---|---|---|---|---|---|---|---|---|---|---|---|---|
CheXNet[7] | 0.698 | 0.640 | 0.687 | 0.821 | 0.894 | 0.815 | 0.866 | 0.884 | 0.821 | 0.716 | 0.907 | 0.795 |
CRAL[21] | 0.681 | 0.664 | 0.697 | 0.823 | 0.897 | 0.805 | 0.816 | 0.864 | 0.882 | 0.768 | 0.888 | 0.799 |
CAN[19] | 0.708 | 0.677 | 0.686 | 0.838 | 0.943 | 0.806 | 0.876 | 0.869 | 0.827 | 0.754 | 0.926 | 0.810 |
DualCheXNet[14] | 0.730 | 0.701 | 0.691 | 0.853 | 0.931 | 0.833 | 0.882 | 0.872 | 0.864 | 0.761 | 0.928 | 0.822 |
A3Net[13] | 0.679 | 0.675 | 0.709 | 0.857 | 0.883 | 0.794 | 0.888 | 0.889 | 0.896 | 0.769 | 0.924 | 0.815 |
CheXGCN[11] | 0.713 | 0.685 | 0.693 | 0.861 | 0.941 | 0.793 | 0.884 | 0.887 | 0.879 | 0.752 | 0.938 | 0.821 |
CheXGAT[23] | 0.709 | 0.669 | 0.711 | 0.859 | 0.906 | 0.790 | 0.909 | 0.885 | 0.913 | 0.773 | 0.948 | 0.825 |
Proposed method | 0.697 | 0.693 | 0.714 | 0.850 | 0.923 | 0.815 | 0.911 | 0.881 | 0.887 | 0.774 | 0.940 | 0.826 |
Tab. 2 AUC value comparison of different methods on DR-Pneumonia dataset
Model | Average AUC (ChestX-ray14) | Average AUC (DR-Pneumonia) |
---|---|---|
Model-1 | 0.825 | 0.821 |
Model-2 | 0.828 | 0.819 |
Model-3 | 0.832 | 0.822 |
Proposed method | 0.834 | 0.826 |
Tab. 3 Results of ablation experiments
Pooling strategy | Average AUC (ChestX-ray14) | Average AUC (DR-Pneumonia) |
---|---|---|
GAP | 0.822 | 0.818 |
SPP | 0.830 | 0.824 |
SAPP | 0.834 | 0.826 |
Tab. 4 Influence of different pooling strategies on model classification performance
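Of the strategies in Tab. 4, GAP collapses each channel to a single value, while SPP [27] max-pools at several grid resolutions and concatenates the results into a fixed-length descriptor; SAPP additionally weights the pyramid with spatial attention. A minimal single-channel sketch of the pyramid pooling step (illustrative only; the function name and pyramid levels are assumptions, not the paper's configuration):

```python
def spatial_pyramid_pool(feature, levels=(1, 2)):
    """Max-pool a single-channel H x W feature map over a pyramid of
    grids and concatenate, producing a fixed-length vector
    (here 1*1 + 2*2 = 5 values) regardless of input size."""
    H, W = len(feature), len(feature[0])
    pooled = []
    for n in levels:
        for i in range(n):
            for j in range(n):
                rows = range(i * H // n, (i + 1) * H // n)
                cols = range(j * W // n, (j + 1) * W // n)
                pooled.append(max(feature[r][c]
                                  for r in rows for c in cols))
    return pooled

fmap = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16]]
print(spatial_pyramid_pool(fmap))  # [16, 6, 8, 14, 16]
```

Because the output length depends only on `levels`, not on H and W, the classifier head can accept chest X-rays of varying resolution.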
Loss function | Average AUC (ChestX-ray14) | Average AUC (DR-Pneumonia) |
---|---|---|
Cross-entropy loss | 0.821 | 0.814 |
Focal loss | 0.827 | 0.817 |
Asymmetric loss | 0.834 | 0.826 |
Tab. 5 Influence of different loss functions on model classification performance
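The asymmetric loss [26] that performs best in Tab. 5 applies separate focusing exponents to positive and negative labels and shifts negative probabilities by a margin, so that abundant easy negatives (most diseases are absent in most chest X-rays) contribute little or nothing. A per-label sketch, using typical hyperparameter values from [26] (the exact values used in this paper are not given here):

```python
import math

def asymmetric_loss(p, y, gamma_pos=1.0, gamma_neg=4.0, margin=0.05):
    """Per-label asymmetric loss (ASL) for predicted probability p and
    binary ground truth y. Negatives are down-weighted harder
    (gamma_neg > gamma_pos), and probability shifting zeroes out
    negatives already below the margin."""
    if y == 1:
        return -((1.0 - p) ** gamma_pos) * math.log(p)
    p_m = max(p - margin, 0.0)  # probability shifting for negatives
    return -(p_m ** gamma_neg) * math.log(1.0 - p_m)

# An easy negative below the margin is discarded entirely...
print(asymmetric_loss(0.04, 0) == 0.0)  # True
# ...while a hard positive still yields a much larger loss than an easy one.
print(asymmetric_loss(0.3, 1) > asymmetric_loss(0.9, 1))  # True
```

This asymmetry is what distinguishes ASL from focal loss [28], which uses a single focusing exponent for both positives and negatives.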
[1] MA X, ZHU L, KURCHE J S, et al. Global and regional burden of interstitial lung disease and pulmonary sarcoidosis from 1990 to 2019: results from the Global Burden of Disease study 2019[J]. Thorax, 2022, 77(6): 596-605.
[2] BRUNO M A, WALKER E A, ABUJUDEH H H. Understanding and confronting our mistakes: the epidemiology of error in radiology and strategies for error reduction[J]. RadioGraphics, 2015, 35(6): 1668-1676.
[3] WANG X, PENG Y, LU L, et al. ChestX-ray8: hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 3462-3471.
[4] RONNEBERGER O, FISCHER P, BROX T. U-Net: convolutional networks for biomedical image segmentation[C]// Proceedings of the 2015 International Conference on Medical Image Computing and Computer-Assisted Intervention, LNCS 9351. Cham: Springer, 2015: 234-241.
[5] HUYNH L D, BOUTRY N. A U-net++ with pre-trained EfficientNet backbone for segmentation of diseases and artifacts in endoscopy images and videos[EB/OL]. [2024-10-11].
[6] HE K, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 2980-2988.
[7] RAJPURKAR P, IRVIN J, ZHU K, et al. CheXNet: radiologist-level pneumonia detection on chest X-rays with deep learning[EB/OL]. [2024-05-23].
[8] KARADDI S H, SHARMA L D. Automated multi-class classification of lung diseases from CXR-images using pre-trained convolutional neural networks[J]. Expert Systems with Applications, 2023, 211: No.118650.
[9] CHEN B, LI J, LU G, et al. Lesion location attention guided network for multi-label thoracic disease classification in chest X-rays[J]. IEEE Journal of Biomedical and Health Informatics, 2020, 24(7): 2016-2027.
[10] SONG Z Y, LUO C, LI T R, et al. Classification of thoracic diseases based on attention mechanisms and two-branch networks[J]. Computer Science, 2024, 51(11A): No.230900116.
[11] CHEN B, LI J, LU G, et al. Label co-occurrence learning with graph convolutional networks for multi-label chest X-ray image classification[J]. IEEE Journal of Biomedical and Health Informatics, 2020, 24(8): 2292-2302.
[12] INDUMATHI V, SIVA R. An efficient lung disease classification from X-ray images using hybrid Mask-RCNN and BiDLSTM[J]. Biomedical Signal Processing and Control, 2023, 81: No.104340.
[13] WANG H, WANG S, QIN Z, et al. Triple attention learning for classification of 14 thoracic diseases using chest radiography[J]. Medical Image Analysis, 2021, 67: No.101846.
[14] CHEN B, LI J, GUO X, et al. DualCheXNet: dual asymmetric feature learning for thoracic disease classification in chest X-rays[J]. Biomedical Signal Processing and Control, 2019, 53: No.101554.
[15] LIU H, WANG L, NAN Y, et al. SDFN: segmentation-based deep fusion network for thoracic disease classification in chest X-ray images[J]. Computerized Medical Imaging and Graphics, 2019, 75: 66-73.
[16] FAISAL M, DARMAWAN J T, BACHROIN N, et al. CheXViT: CheXNet and Vision Transformer to multi-label chest X-ray image classification[C]// Proceedings of the 2023 IEEE International Symposium on Medical Measurements and Applications. Piscataway: IEEE, 2023: 1-6.
[17] WANG H, JIA H, LU L, et al. Thorax-Net: an attention regularized deep neural network for classification of thoracic diseases on chest radiography[J]. IEEE Journal of Biomedical and Health Informatics, 2020, 24(2): 475-485.
[18] XU Y, LAM H K, JIA G. MANet: a two-stage deep learning method for classification of COVID-19 from chest X-ray images[J]. Neurocomputing, 2021, 443: 96-105.
[19] MA C, WANG H, HOI S C H. Multi-label thoracic disease image classification with cross-attention networks[C]// Proceedings of the 2019 International Conference on Medical Image Computing and Computer-Assisted Intervention, LNCS 11769. Cham: Springer, 2019: 730-738.
[20] SUN Z, QU L, LUO J, et al. Label correlation transformer for automated chest X-ray diagnosis with reliable interpretability[J]. La Radiologia Medica, 2023, 128(6): 726-733.
[21] GUAN Q, HUANG Y. Multi-label chest X-ray image classification via category-wise residual attention learning[J]. Pattern Recognition Letters, 2020, 130: 259-266.
[22] CHEN B, ZHANG Z, LI Y, et al. Multi-label chest X-ray image classification via semantic similarity graph embedding[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2022, 32(4): 2455-2468.
[23] LEE Y W, HUANG S K, CHANG R F. CheXGAT: a disease correlation-aware network for thorax disease diagnosis from chest X-ray images[J]. Artificial Intelligence in Medicine, 2022, 132: No.102382.
[24] WANG Q, WU B, ZHU P, et al. ECA-Net: efficient channel attention for deep convolutional neural networks[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 11531-11539.
[25] HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7132-7141.
[26] RIDNIK T, BEN-BARUCH E, ZAMIR N, et al. Asymmetric loss for multi-label classification[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 82-91.
[27] HE K, ZHANG X, REN S, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904-1916.
[28] LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 2999-3007.
[29] SELVARAJU R R, COGSWELL M, DAS A, et al. Grad-CAM: visual explanations from deep networks via gradient-based localization[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 618-626.