Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (10): 3399-3406. DOI: 10.11772/j.issn.1001-9081.2024101404
• Frontier and Comprehensive Applications •
Multi-view difficult airway recognition based on discriminant region guidance
Songlin WU1, Guangchao ZHANG2, Yuan YAO3, Bo PENG1
Received: 2024-10-07
Revised: 2025-01-08
Accepted: 2025-01-09
Online: 2025-01-13
Published: 2025-10-10
Contact: Bo PENG
About author: WU Songlin, born in 1999, M. S. candidate, CCF member. His research interests include deep learning and computer vision.
Abstract:
Difficult airway (DA) is a critical preoperative risk factor in clinical surgery, yet its accurate recognition faces several challenges: small dataset size, severe class imbalance, and the limited discriminative power of single-view recognition. To address these problems, a multi-view DA recognition model, DRG-MV-Net (Discriminative Region Guided Multi-View Net), was proposed. In the first stage, a Discriminative Region Guidance Module (DRGM) uses Class Activation Mapping (CAM) to automatically detect and emphasize the key discriminative regions in facial views, generating two kinds of augmented images with specific characteristics. In the second stage, a ResNet-18 backbone integrated with a Dilated Convolutional Block Attention Module (D-CBAM) extracts features from each view, and a Multi-view Cross Fusion Module (MCFM) integrates the multi-view features. In addition, Focal Loss is combined with stratified hybrid sampling to mitigate class imbalance. Evaluation on the constructed clinical dataset shows that the proposed model achieves a geometric mean accuracy (G-Mean) of 77.22%, an F1-Score of 43.88%, a Matthews Correlation Coefficient (MCC) of 38.73%, and an Area Under the receiver operating characteristic Curve (AUC) of 0.7407. Compared with MCE-Net (Multi-view Contrastive representation prior and Ensemble classification Network), a recent DA recognition model, the proposed model improves G-Mean, F1-Score, and MCC by 2.41, 2.34, and 3.41 percentage points, respectively; compared with the ResNet-18 baseline, the improvements are 4.85, 6.85, and 8.25 percentage points. These results verify the effectiveness of the proposed model for DA recognition on small, imbalanced datasets and offer new insights and methods for the complex task of DA recognition.
Songlin WU, Guangchao ZHANG, Yuan YAO, Bo PENG. Multi-view difficult airway recognition based on discriminant region guidance[J]. Journal of Computer Applications, 2025, 45(10): 3399-3406.
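The first stage of DRG-MV-Net rests on CAM [14]-style localization. As a concrete illustration, the following Python sketch derives a class activation map from a trained ResNet-18 and produces the two augmented views compared in Table 3: a crop of the high-activation facial region and a CAM-weighted fusion. This is a minimal sketch only; the paper's actual DRGM design, threshold, and fusion rule are not specified in this excerpt, and `cam_augment` with its `thresh` parameter is an illustrative stand-in.

```python
# Sketch of the Stage-1 idea (discriminative region guidance via CAM),
# assuming a trained ResNet-18 classifier; thresholds and the fusion rule
# are illustrative, not the authors' exact DRGM.
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

model = resnet18(num_classes=2).eval()  # stand-in for a trained DA classifier

def cam_augment(image, target_class=1, thresh=0.5):
    """image: (1, 3, H, W) float tensor; returns (cropped, fused) views."""
    feats = {}
    # Capture the last convolutional feature map via a forward hook.
    hook = model.layer4.register_forward_hook(
        lambda m, i, o: feats.__setitem__("f", o))
    with torch.no_grad():
        model(image)
    hook.remove()
    f = feats["f"][0]                           # (C, h, w) conv features
    w = model.fc.weight[target_class]           # (C,) classifier weights
    cam = F.relu(torch.einsum("c,chw->hw", w, f))
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
    cam = F.interpolate(cam[None, None], size=image.shape[-2:],
                        mode="bilinear", align_corners=False)[0, 0]
    ys, xs = torch.where(cam >= thresh)
    if ys.numel() == 0:                         # no pixel passed the threshold
        ys, xs = torch.arange(cam.shape[0]), torch.arange(cam.shape[1])
    # View 1: tight crop around the high-activation (discriminative) region.
    cropped = image[..., ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    cropped = F.interpolate(cropped, size=image.shape[-2:],
                            mode="bilinear", align_corners=False)
    # View 2: CAM-weighted fusion that emphasizes the same region in place.
    fused = image * (0.5 + 0.5 * cam)
    return cropped, fused
```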
Tab. 1  Comparison of DRG-MV-Net and advanced methods for multi-view DA recognition (%)

| Method | G-Mean | Sensitivity | Specificity | F1-Score | MCC |
| --- | --- | --- | --- | --- | --- |
| PFLD | 60.85±6.66 | 57.78±12.61 | 64.86±4.27 | 26.83±5.57 | 14.92±8.66 |
| Ref. [ ] | 60.97±6.35 | 61.11±10.80 | 61.29±4.69 | 26.55±4.78 | 14.53±8.10 |
| Ref. [ ] | 63.68±8.22 | 58.89±11.77 | 69.43±7.38 | 30.46±9.28 | 19.62±12.32 |
| Ref. [ ] | 64.90±4.08 | 70.00±10.54 | 60.71±4.27 | 29.38±3.09 | 19.78±5.58 |
| DMF-Net | 70.18±5.76 | 77.22±7.86 | 68.43±6.30 | 35.06±6.07 | 27.24±8.35 |
| MCE-Net | 74.81±7.25 | 75.56±11.48 | 74.71±7.68 | 41.54±9.04 | 35.32±11.76 |
| DRG-MV-Net | 77.22±4.90 | 78.89±9.73 | 76.14±7.03 | 43.88±6.33 | 38.73±7.61 |
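Entries in Tables 1-5 are mean±standard deviation, in percent. The imbalance-aware metrics can be reproduced from a binary confusion matrix as sketched below using standard scikit-learn calls; the `da_metrics` helper is an illustrative name, not from the paper.

```python
# Computing the imbalance-aware metrics reported in Tables 1-5.
import numpy as np
from sklearn.metrics import (confusion_matrix, f1_score,
                             matthews_corrcoef, roc_auc_score)

def da_metrics(y_true, y_pred, y_score):
    """y_true/y_pred: 0/1 labels (1 = DA); y_score: DA-class probability."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    sensitivity = tp / (tp + fn)          # recall on the DA (positive) class
    specificity = tn / (tn + fp)          # recall on the easy-airway class
    return {
        "G-Mean": np.sqrt(sensitivity * specificity),  # geometric mean
        "Sensitivity": sensitivity,
        "Specificity": specificity,
        "F1-Score": f1_score(y_true, y_pred),
        "MCC": matthews_corrcoef(y_true, y_pred),
        "AUC": roc_auc_score(y_true, y_score),
    }

# Toy usage with four samples.
print(da_metrics([0, 0, 1, 1], [0, 1, 1, 1], [0.2, 0.6, 0.7, 0.9]))
```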
Tab. 2  Performance comparison of different backbone networks (%)

| Model | G-Mean | Sensitivity | Specificity | F1-Score | MCC |
| --- | --- | --- | --- | --- | --- |
| AlexNet | 56.45±12.92 | 53.33±20.82 | 62.14±4.87 | 23.74±8.93 | 10.04±14.63 |
| VGG-19 | 59.10±11.26 | 56.67±18.48 | 63.29±3.30 | 25.54±8.05 | 12.98±12.99 |
| VGG-16 | 64.58±7.29 | 63.33±10.54 | 66.14±5.30 | 30.01±6.21 | 19.55±9.65 |
| MobileNetV2 | 61.11±7.05 | 60.00±15.89 | 64.00±6.76 | 27.06±5.10 | 15.73±8.22 |
| InceptionV3 | 61.95±7.77 | 62.22±15.00 | 62.86±4.99 | 27.48±5.81 | 16.29±9.35 |
| DenseNet | 61.52±7.91 | 63.33±12.88 | 60.29±5.34 | 27.01±6.22 | 15.29±10.23 |
| ResNet-50 | 65.68±5.09 | 67.78±8.20 | 63.86±4.21 | 30.33±4.34 | 20.63±6.77 |
| ResNet-34 | 65.94±6.16 | 65.56±9.73 | 66.57±3.88 | 30.98±5.46 | 21.24±8.30 |
| ResNet-18 | 67.01±6.88 | 68.89±8.76 | 65.29±5.95 | 31.85±6.22 | 22.56±9.47 |
| DRG-MV-Net | 77.22±4.90 | 78.89±9.73 | 76.14±7.03 | 43.88±6.33 | 38.73±7.61 |
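Table 2 swaps only the per-view feature extractor. A minimal sketch of the stage-2 layout, assuming one ResNet-18 trunk per facial view, is shown below; the attention-weighted sum stands in for the paper's MCFM, whose internals are not detailed in this excerpt, and `MultiViewNet` and `num_views=3` are illustrative assumptions.

```python
# Sketch of the Stage-2 layout: one ResNet-18 trunk per facial view plus a
# placeholder cross-view fusion; the real MCFM is not reproduced here.
import torch
import torch.nn as nn
from torchvision.models import resnet18

class MultiViewNet(nn.Module):
    def __init__(self, num_views=3, num_classes=2):
        super().__init__()
        def trunk():
            m = resnet18(weights=None)   # optionally ImageNet-pretrained [26]
            m.fc = nn.Identity()         # expose the 512-d feature vector
            return m
        self.trunks = nn.ModuleList(trunk() for _ in range(num_views))
        # Placeholder fusion: attention-weighted sum over per-view features.
        self.attn = nn.Linear(512, 1)
        self.head = nn.Linear(512, num_classes)

    def forward(self, views):            # views: list of (b, 3, H, W) tensors
        feats = torch.stack([t(v) for t, v in zip(self.trunks, views)], dim=1)
        w = torch.softmax(self.attn(feats), dim=1)   # (b, V, 1) view weights
        return self.head((w * feats).sum(dim=1))     # fused logits
```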
Tab. 3  Performance comparison of different combinations of augmented images (%)

| Image combination | G-Mean | Sensitivity | Specificity | F1-Score | MCC |
| --- | --- | --- | --- | --- | --- |
| Original | 67.01±6.88 | 68.89±8.76 | 65.29±5.95 | 31.85±6.22 | 22.56±9.47 |
| Cropped | 71.22±8.94 | 72.22±13.09 | 70.57±6.87 | 36.77±9.40 | 29.17±13.05 |
| Fused | 70.47±7.90 | 73.33±15.00 | 68.57±6.87 | 35.44±7.49 | 28.06±10.99 |
| Original + cropped | 70.93±9.69 | 73.33±15.89 | 69.29±6.57 | 36.14±9.46 | 28.72±13.55 |
| Original + fused | 68.30±7.68 | 67.78±13.30 | 69.43±4.48 | 33.57±7.07 | 24.96±10.28 |
| Cropped + fused | 72.37±4.84 | 74.44±9.15 | 70.71±5.36 | 37.30±5.25 | 30.48±7.21 |
| Original + cropped + fused | 70.61±5.09 | 71.11±10.73 | 70.86±6.87 | 36.03±5.24 | 28.52±7.00 |
Tab. 4  Performance comparison of different attention mechanisms (%)

| Method | G-Mean | Sensitivity | Specificity | F1-Score | MCC |
| --- | --- | --- | --- | --- | --- |
| Baseline | 72.37±4.84 | 74.44±9.15 | 70.71±5.36 | 37.30±5.25 | 30.48±7.21 |
| +CA | 72.18±5.22 | 73.33±10.73 | 71.57±5.10 | 37.33±4.91 | 30.42±7.13 |
| +SE | 71.63±7.30 | 74.44±12.88 | 69.43±5.18 | 36.36±6.86 | 29.33±10.06 |
| +EMA | 69.22±10.03 | 70.00±15.76 | 69.00±6.14 | 34.58±9.30 | 26.25±13.80 |
| +CBAM | 72.67±6.67 | 74.44±13.22 | 71.29±9.30 | 37.69±7.12 | 31.43±7.24 |
| +D-CBAM | 73.00±6.19 | 75.56±12.61 | 71.43±8.60 | 38.59±6.56 | 32.18±8.91 |
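The abstract indicates that D-CBAM integrates dilated convolution [31] into CBAM [32]; the exact wiring is not given in this excerpt, so the block below is a sketch under the assumption that dilation enlarges the receptive field of CBAM's 7×7 spatial-attention convolution.

```python
# Sketch of a dilated CBAM block: CBAM channel attention followed by a
# spatial attention whose 7x7 conv uses dilation for a larger receptive
# field. The paper's exact D-CBAM design is an assumption here.
import torch
import torch.nn as nn

class DilatedCBAM(nn.Module):
    def __init__(self, channels, reduction=16, dilation=2):
        super().__init__()
        self.mlp = nn.Sequential(              # shared channel-attention MLP
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels))
        # padding = 3 * dilation keeps the spatial size for a 7x7 kernel.
        self.spatial = nn.Conv2d(2, 1, kernel_size=7,
                                 padding=3 * dilation, dilation=dilation)

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))     # global average pooling branch
        mx = self.mlp(x.amax(dim=(2, 3)))      # global max pooling branch
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)   # channel attention
        s = torch.cat([x.mean(1, keepdim=True),
                       x.amax(1, keepdim=True)], dim=1)    # (b, 2, h, w)
        return x * torch.sigmoid(self.spatial(s))          # spatial attention
```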
Tab. 5  Ablation experimental results of modules (%)

| D-CBAM | MCFM | Focal Loss | Stratified hybrid sampling | G-Mean | Sensitivity | Specificity | F1-Score | MCC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  |  |  |  | 72.37±4.84 | 74.44±9.15 | 70.71±5.36 | 37.30±5.25 | 30.48±7.21 |
| ✓ |  |  |  | 73.00±6.19 | 75.56±12.61 | 71.43±8.60 | 38.59±6.56 | 32.18±8.91 |
|  | ✓ |  |  | 71.76±8.27 | 72.22±13.09 | 72.00±8.28 | 37.84±8.94 | 30.56±11.99 |
|  |  | ✓ |  | 71.90±7.11 | 73.33±10.73 | 70.86±6.77 | 37.31±7.95 | 30.11±10.66 |
|  |  |  | ✓ | 71.43±9.75 | 70.00±15.76 | 73.86±8.30 | 38.24±9.81 | 30.74±13.49 |
| ✓ | ✓ |  |  | 73.87±5.80 | 77.78±9.07 | 70.43±5.87 | 38.58±6.41 | 32.45±8.60 |
| ✓ | ✓ | ✓ |  | 72.96±8.18 | 71.11±11.94 | 75.29±6.97 | 39.96±9.53 | 32.81±12.38 |
| ✓ | ✓ |  | ✓ | 75.03±6.20 | 73.33±9.37 | 77.00±5.36 | 42.15±7.29 | 35.76±9.49 |
| ✓ |  | ✓ | ✓ | 75.09±5.92 | 75.56±10.21 | 75.00±4.91 | 41.14±6.33 | 35.09±8.46 |
|  | ✓ | ✓ | ✓ | 73.27±7.61 | 76.67±11.05 | 70.43±8.08 | 38.59±8.38 | 32.08±11.28 |
| ✓ | ✓ | ✓ | ✓ | 77.22±4.90 | 78.89±9.73 | 76.14±7.03 | 43.88±6.33 | 38.73±7.61 |
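The last two ablation columns target class imbalance. Focal Loss [25], FL(p_t) = -α_t(1 - p_t)^γ log(p_t), down-weights easy examples so the rare DA class contributes a larger share of the gradient; a common implementation is sketched below together with a class-balanced sampler as one plausible reading of stratified sampling. The paper's exact hybrid scheme is not reproduced, and both function names are illustrative.

```python
# Focal Loss [25] plus a class-balanced sampler; a sketch of one possible
# imbalance-handling setup, not the paper's exact hybrid scheme.
import torch
import torch.nn.functional as F
from torch.utils.data import WeightedRandomSampler

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """logits: (N, 2); targets: (N,) int64 labels with 1 = difficult airway."""
    ce = F.cross_entropy(logits, targets, reduction="none")
    p_t = torch.exp(-ce)                        # model's prob. of true class
    alpha_t = torch.where(targets == 1,
                          torch.full_like(ce, alpha),
                          torch.full_like(ce, 1 - alpha))
    return (alpha_t * (1.0 - p_t) ** gamma * ce).mean()

def balanced_sampler(labels):
    """Oversample the minority (DA) class so batches are roughly balanced."""
    labels = torch.as_tensor(labels)
    counts = torch.bincount(labels).float()
    weights = (1.0 / counts)[labels]            # rarer class drawn more often
    return WeightedRandomSampler(weights, num_samples=len(labels),
                                 replacement=True)
```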
[1] | COOK T M, WOODALL N, FRERK C, et al. Major complications of airway management in the UK: results of the Fourth National Audit Project of the Royal College of Anaesthetists and the Difficult Airway Society. part 1: anaesthesia[J]. British Journal of Anaesthesia, 2011, 106(5): 617-631. |
[2] | PANDIT J J, ANDRADE J, BOGOD D G, et al. 5th National Audit Project (NAP5) on accidental awareness during general anaesthesia: summary of main findings and risk factors[J]. British Journal of Anaesthesia, 2014, 113(4): 540-548. |
[3] | CARSETTI A, SORBELLO M, ADRARIO E, et al. Airway ultrasound as predictor of difficult direct laryngoscopy: a systematic review and meta-analysis[J]. Anesthesia and Analgesia, 2022, 134(4): 740-750. |
[4] | AMANITI A, PAPAKONSTANTINOU P, GKINAS D, et al. Comparison of laryngoscopic views between C-MAC™ and conventional laryngoscopy in patients with multiple preoperative prognostic criteria of difficult intubation. An observational cross-sectional study[J]. Medicina, 2019, 55(12): No.760. |
[5] | WANG J, XIA M, ZHOU R, et al. Analysis of facial features related to difficult visual laryngoscopic glottis exposure[J]. Academic Journal of the Second Military Medical University, 2021, 42(12): 1382-1387. (in Chinese) |
[6] | HEIDEGGER T. Management of the difficult airway[J]. The New England Journal of Medicine, 2021, 384(19): 1836-1847. |
[7] | LAW J A, BROEMLING N, COOPER R M, et al. The difficult airway with recommendations for management — part 2 — the anticipated difficult airway[J]. Canadian Journal of Anaesthesia, 2013, 60(11): 1119-1138. |
[8] | APFELBAUM J L, HAGBERG C A, CONNIS R T, et al. 2022 American Society of Anesthesiologists practice guidelines for management of the difficult airway[J]. Anesthesiology, 2022, 136(1): 31-81. |
[9] | CONNOR C W, SEGAL S. Accurate classification of difficult intubation by computerized facial analysis[J]. Anesthesia and Analgesia, 2011, 112(1): 84-93. |
[10] | CONNOR C W, SEGAL S. The importance of subjective facial appearance on the ability of anesthesiologists to predict difficult intubation[J]. Anesthesia and Analgesia, 2014, 118(2): 419-427. |
[11] | CUENDET G L, SCHOETTKER P, YÜCE A, et al. Facial image analysis for fully automatic prediction of difficult endotracheal intubation[J]. IEEE Transactions on Biomedical Engineering, 2016, 63(2): 328-339. |
[12] | HAYASAKA T, KAWANO K, KURIHARA K, et al. Creation of an artificial intelligence model for intubation difficulty classification by deep learning (convolutional neural network) using face images: an observational study[J]. Journal of Intensive Care, 2021, 9: No.38. |
[13] | SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[EB/OL]. [2024-04-18]. |
[14] | SELVARAJU R R, COGSWELL M, DAS A, et al. Grad-CAM: visual explanations from deep networks via gradient-based localization[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 618-626. |
[15] | LIN Q, CHNG C B, TOO J, et al. Towards artificial intelligence-enabled medical pre-operative airway assessment[C]// Proceedings of the 2022 IEEE International Conference on E-health Networking, Application and Services. Piscataway: IEEE, 2022: 69-74. |
[16] | TAVOLARA T E, GURCAN M N, SEGAL S, et al. Identification of difficult to intubate patients from frontal face images using an ensemble of deep learning models[J]. Computers in Biology and Medicine, 2021, 136: No.104737. |
[17] | GARCÍA-GARCÍA F, LEE D J, MENDOZA-GARCÉS F J, et al. Automated location of orofacial landmarks to characterize airway morphology in anaesthesia via deep convolutional neural networks[J]. Computer Methods and Programs in Biomedicine, 2023, 232: No.107428. |
[18] | GARCÍA-GARCÍA F, LEE D J, MENDOZA-GARCÉS F J, et al. Reliable prediction of difficult airway for tracheal intubation from patient preoperative photographs by machine learning methods[J]. Computer Methods and Programs in Biomedicine, 2024, 248: No.108118. |
[19] | WANG G, LI C, TANG F, et al. A fully-automatic semi-supervised deep learning model for difficult airway assessment[J]. Heliyon, 2023, 9(5): No.e15629. |
[20] | XIA M, JIN C, ZHENG Y, et al. Deep learning-based facial analysis for predicting difficult videolaryngoscopy: a feasibility study[J]. Anaesthesia, 2024, 79(4): 399-409. |
[21] | HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 770-778. |
[22] | WU J, YAO Y, ZHANG G, et al. Difficult airway assessment based on multi-view metric learning[J]. Bioengineering, 2024, 11(7): No.703. |
[23] | LI X, PENG B, YAO Y, et al. Difficult airway assessment with multi-view contrastive representation prior and ensemble classification[J]. Biomedical Signal Processing and Control, 2024, 98: No.106738. |
[24] | HO J, SALIMANS T. Classifier-free diffusion guidance[EB/OL]. [2024-04-18]. |
[25] | LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 2999-3007. |
[26] | DENG J, DONG W, SOCHER R, et al. ImageNet: a large-scale hierarchical image database[C]// Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2009: 248-255. |
[27] | KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks [C]// Proceedings of the 25th International Conference on Neural Information Processing Systems — Volume 1. Red Hook: Curran Associates Inc., 2012: 1097-1105. |
[28] | SANDLER M, HOWARD A, ZHU M, et al. MobileNetV2: inverted residuals and linear bottlenecks[C]// Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 4510-4520. |
[29] | SZEGEDY C, VANHOUCKE V, IOFFE S, et al. Rethinking the Inception architecture for computer vision[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 2818-2826. |
[30] | HUANG G, LIU Z, VAN DER MAATEN L, et al. Densely connected convolutional networks[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 2261-2269. |
[31] | YU F, KOLTUN V. Multi-scale context aggregation by dilated convolutions[EB/OL]. [2024-04-18]. |
[32] | WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11211. Cham: Springer, 2018: 3-19. |
[33] | VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2017: 6000-6010. |
[34] | KRAGE R, VAN RIJN C, VAN GROENINGEN D, et al. Cormack-Lehane classification revisited[J]. British Journal of Anaesthesia, 2010, 105(2): 220-227. |
[35] | HOU Q, ZHOU D, FENG J. Coordinate attention for efficient mobile network design[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 13708-13717. |
[36] | HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7132-7141. |
[37] | OUYANG D, HE S, ZHANG G, et al. Efficient multi-scale attention module with cross-spatial learning[C]// Proceedings of the 2023 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2023: 1-5. |