Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (10): 3399-3406. DOI: 10.11772/j.issn.1001-9081.2024101404

• Frontier and Comprehensive Applications •

Multi-view difficult airway recognition based on discriminative region guidance

WU Songlin1, ZHANG Guangchao2, YAO Yuan3, PENG Bo1

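  1. School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756
    2. Anesthesia and Operation Center, West China Hospital, Sichuan University, Chengdu 610044
    3. Division of General Practice, West China Hospital, Sichuan University, Chengdu 610044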
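  • Received: 2024-10-07 Revised: 2025-01-08 Accepted: 2025-01-09 Online: 2025-01-13 Published: 2025-10-10
  • Corresponding author: PENG Bo
  • About the authors: WU Songlin (born 1999), male, from Laixi, Shandong, M.S. candidate, CCF member. Research interests: deep learning, computer vision.
    ZHANG Guangchao (born 1992), male, from Xinxiang, Henan. Research interests: difficult airway management, nerve block anesthesia, postoperative delirium.
    YAO Yuan (born 1991), male, from Xinxiang, Henan, M.S. Research interests: general practice, medical artificial intelligence.
    PENG Bo (born 1980), female, from Chengdu, Sichuan, professor, Ph.D., CCF member. Research interests: computer vision, pattern recognition. Email: bpeng@swjtu.edu.cn
  • Supported by:
    Sichuan Science and Technology Program (2024YFHZ0059)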

Multi-view difficult airway recognition based on discriminant region guidance

Songlin WU1, Guangchao ZHANG2, Yuan YAO3, Bo PENG1

  1. School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu, Sichuan 611756, China
    2. Anesthesia and Operation Center, West China Hospital of Sichuan University, Chengdu, Sichuan 610044, China
    3. Division of General Practice, West China Hospital of Sichuan University, Chengdu, Sichuan 610044, China
  • Received: 2024-10-07 Revised: 2025-01-08 Accepted: 2025-01-09 Online: 2025-01-13 Published: 2025-10-10
  • Contact: Bo PENG
  • About the authors: WU Songlin, born in 1999, M.S. candidate. His research interests include deep learning and computer vision.
    ZHANG Guangchao, born in 1992. His research interests include difficult airway management, nerve block anesthesia, and postoperative delirium.
    YAO Yuan, born in 1991, M.S. His research interests include general medicine and medical artificial intelligence.
    PENG Bo, born in 1980, Ph.D., professor. Her research interests include computer vision and pattern recognition.
  • Supported by:
    Sichuan Science and Technology Program (2024YFHZ0059)

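Abstract:

Difficult airway (DA) is a key preoperative risk factor in clinical surgery, but identifying it accurately is challenging: datasets are small, the classes are severely imbalanced, and a single view has limited discriminative power. To address these problems, a multi-view DA recognition model, DRG-MV-Net (Discriminative Region Guided Multi-View Net), is proposed. In the first stage of the model, the Discriminative Region Guidance Module (DRGM) uses Class Activation Mapping (CAM) to automatically detect and emphasize the key discriminative regions in facial views, generating two kinds of augmented images with specific characteristics. In the second stage, a ResNet-18 backbone integrated with the Dilated-Convolution Block Attention Module (D-CBAM) extracts features from each view, and the Multi-View Cross Fusion Module (MCFM) then integrates the multi-view features. In addition, Focal Loss is combined with layered hybrid sampling to mitigate class imbalance. Evaluation on the constructed clinical dataset shows that the proposed model achieves a geometric mean (G-Mean) of 77.22%, an F1-Score of 43.88%, a Matthews Correlation Coefficient (MCC) of 38.73%, and an area under the receiver operating characteristic curve (AUC) of 0.7407. Compared with the recent DA recognition model MCE-Net (Multi-view Contrastive representation prior and Ensemble classification Network), the proposed model improves G-Mean, F1-Score, and MCC by 2.41, 2.34, and 3.41 percentage points, respectively; compared with the baseline ResNet-18, it improves them by 4.85, 6.85, and 8.25 percentage points, respectively. These results verify the effectiveness of the proposed model for DA recognition on small, imbalanced datasets and offer new insights and methods for the complex problem of DA recognition.

Keywords: difficult airway recognition, multi-view learning, data augmentation, class imbalance, feature fusion, attention mechanism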
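
The abstract names Focal Loss as one of the two measures against class imbalance but does not give its settings. As a rough illustration only, the sketch below implements the standard binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), in plain Python. The function name `binary_focal_loss` and the defaults `alpha=0.25` and `gamma=2.0` are assumptions taken from common practice, not values reported for DRG-MV-Net.

```python
import math


def binary_focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Focal loss for one sample.

    p: predicted probability of the positive (DA) class.
    y: ground-truth label, 1 for positive and 0 for negative.
    alpha, gamma: assumed defaults, not taken from the paper.
    """
    p = min(max(p, eps), 1.0 - eps)           # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p            # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t)^gamma shrinks the loss of well-classified samples
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical predictions for two positive samples
easy = binary_focal_loss(0.95, 1)   # confidently correct
hard = binary_focal_loss(0.30, 1)   # mostly wrong
print(f"easy positive: {easy:.5f}, hard positive: {hard:.5f}")
```

With `gamma=0` the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy, which is a convenient sanity check when tuning either parameter.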

Abstract:

Difficult Airway (DA) is a critical preoperative risk factor in clinical surgery, and its accurate recognition faces several challenges, including small dataset size, severe class imbalance, and the limited recognition capability of a single view. To address these issues, a multi-view DA recognition model, DRG-MV-Net (Discriminative Region Guided Multi-View Net), was proposed. In the first stage of the model, the Discriminative Region Guidance Module (DRGM) was employed to automatically detect and emphasize key discriminative regions in facial views using Class Activation Mapping (CAM), thereby generating two types of augmented images with specific features. In the second stage of the model, the features of each view were extracted using a ResNet-18 backbone network integrated with the Dilated-Convolution Block Attention Module (D-CBAM), and multi-view feature integration was performed via the Multi-View Cross Fusion Module (MCFM). In addition, Focal Loss and layered hybrid sampling were combined to mitigate class imbalance. Evaluation results on the constructed clinical dataset demonstrate that the proposed model achieves a G-Mean of 77.22%, an F1-Score of 43.88%, a Matthews Correlation Coefficient (MCC) of 38.73%, and an Area Under the receiver operating Characteristic curve (AUC) of 0.7407. Compared with the recent DA recognition model MCE-Net (Multi-view Contrastive representation prior and Ensemble classification Network), the proposed model improves G-Mean, F1-Score, and MCC by 2.41, 2.34, and 3.41 percentage points, respectively; compared with the baseline model ResNet-18, it improves these metrics by 4.85, 6.85, and 8.25 percentage points, respectively. These results verify the effectiveness of the proposed model in DA recognition on small, imbalanced datasets and provide new insights and methods for solving complex DA recognition.

Key words: Difficult Airway (DA) recognition, multi-view learning, data augmentation, category number imbalance, feature fusion, attention mechanism
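
The reported G-Mean, F1-Score, and MCC are all functions of a binary confusion matrix, and each is less easily inflated by a dominant negative class than plain accuracy. The sketch below shows how the three are commonly computed. It assumes the usual definition of G-Mean as the geometric mean of sensitivity and specificity, which the abstract does not spell out, and the function name `binary_metrics` and the example counts are hypothetical rather than taken from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute G-Mean, F1-Score, and MCC from confusion-matrix counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0   # recall on DA cases
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # recall on non-DA cases
    precision = tp / (tp + fp) if (tp + fp) else 0.0

    # Assumed definition: geometric mean of the two per-class recalls
    g_mean = math.sqrt(sensitivity * specificity)

    pr_sum = precision + sensitivity
    f1 = 2.0 * precision * sensitivity / pr_sum if pr_sum else 0.0

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"g_mean": g_mean, "f1": f1, "mcc": mcc}


# Hypothetical counts for an imbalanced test set (10 positives, 90 negatives)
metrics = binary_metrics(tp=8, fp=20, tn=70, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In an example like this one, accuracy is 78% while F1 and MCC stay far lower because precision on the minority class is poor, which is one reason imbalanced-classification work tends to report these metrics together.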

CLC number: