Journal of Computer Applications

Multi-view difficult airway recognition based on discriminant region guidance

  

  • Received: 2024-10-07 Revised: 2025-01-08 Accepted: 2025-01-09 Online: 2025-01-13 Published: 2025-01-13
  • Supported by:
    Sichuan Science and Technology Program

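吴松霖 1, 张广朝 2, 姚远 3, 彭博 1

  1. School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756; 2. Department of Anesthesiology, West China Hospital, Sichuan University, Chengdu 610044;
    3. General Practice Medical Center, West China Hospital, Sichuan University, Chengdu 610044
  • Corresponding author: 彭博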

Abstract: Difficult Airway (DA) is a critical preoperative risk factor in clinical surgery, yet its accurate identification faces several challenges: small dataset size, severe class imbalance, and the insufficient recognition capability of any single view. To address these limitations, a multi-view DA identification model, DRG-MV-Net (Discriminative Region Guided Multi-View Net), was proposed. In the first stage, the Discriminative Region Guided Module (DRGM) was employed to automatically detect and emphasize key discriminative regions in facial views using Class Activation Mapping (CAM), generating two types of augmented images with specific features. In the second stage, features for each view were extracted using a ResNet-18 backbone integrated with the Dilated Convolution Block Attention Module (D-CBAM), and multi-view feature integration was performed via the Multi-View Cross Fusion Module (MCFM). Focal Loss was combined with stratified hybrid sampling to mitigate the class imbalance problem. Evaluation on the constructed clinical dataset demonstrated that the proposed model achieved a G-Mean of 77.22%, an F1-Score of 43.88%, a Matthews Correlation Coefficient (MCC) of 38.73%, and an Area Under the receiver operating Characteristic curve (AUC) of 0.7407. Compared with the recent method MCE-Net (Multi-view Contrastive representation prior and Ensemble classification Network), the G-Mean, F1-Score, and MCC improved by 2.41, 2.34, and 3.41 percentage points, respectively. Compared with the baseline model ResNet-18, these metrics improved by 4.85, 6.85, and 8.25 percentage points, respectively, verifying the effectiveness of the proposed method for DA identification on small, imbalanced datasets and offering new insights and methods for complex DA identification tasks.
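The imbalance-aware loss and the reported metrics can be illustrated with a minimal sketch. It assumes the commonly used binary form of Focal Loss and the standard confusion-matrix definitions of G-Mean (geometric mean of sensitivity and specificity), F1-Score, and MCC; the abstract does not state the exact formulas or hyperparameters used, so the `alpha` and `gamma` defaults and the function names below are illustrative assumptions rather than the authors' settings.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for a single sample.

    p: predicted probability of the positive (DA) class.
    y: ground-truth label, 1 for DA and 0 for non-DA.
    alpha, gamma: illustrative defaults, NOT the paper's settings.
    """
    p_t = p if y == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-dependent weight
    p_t = min(max(p_t, 1e-12), 1.0)             # guard against log(0)
    # (1 - p_t)^gamma down-weights samples that are already well classified.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def imbalance_metrics(tp, fn, fp, tn):
    """G-Mean, F1-Score, and MCC from binary confusion-matrix counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    g_mean = math.sqrt(sensitivity * specificity)
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"g_mean": g_mean, "f1": f1, "mcc": mcc}


# Hypothetical counts for an imbalanced test set (not data from the paper).
example = imbalance_metrics(tp=8, fn=2, fp=20, tn=70)
```

With a rare positive class, the F1-Score and MCC can stay well below the G-Mean even when sensitivity and specificity are both reasonably high, because false positives from the large negative class dominate precision. This is consistent with the pattern of values reported above.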

Key words: difficult airway recognition, multi-view learning, data augmentation, imbalanced number of categories, feature fusion, attention mechanism
