Multi-view difficult airway recognition based on discriminant region guidance

doi:10.11772/j.issn.1001-9081.2024101404

Journal of Computer Applications

Received: 2024-10-07 Revised: 2025-01-08 Accepted: 2025-01-09 Online: 2025-01-13 Published: 2025-01-13
Supported by: Sichuan Science and Technology Program


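Songlin WU (吴松霖)¹, Guangchao ZHANG (张广朝)², Yuan YAO (姚远)³, Bo PENG (彭博)¹

1. School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756; 2. Department of Anesthesiology, West China Hospital, Sichuan University, Chengdu 610044;
3. General Practice Medical Center, West China Hospital, Sichuan University, Chengdu 610044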

Corresponding author: Bo PENG (彭博)

Abstract

Difficult Airway (DA) is a critical preoperative risk factor in clinical surgery, yet its accurate identification faces several challenges: small datasets, severe class imbalance, and the limited recognition capability of a single view. To address these limitations, a two-stage multi-view DA identification model, DRG-MV-Net (Discriminative Region Guided Multi-View Net), was proposed. In the first stage, the Discriminative Region Guided Module (DRGM) used Class Activation Mapping (CAM) to automatically detect and emphasize key discriminative regions in facial views, generating two types of augmented images with specific features. In the second stage, the features of each view were extracted by a ResNet-18 backbone integrated with the Dilated Convolution Block Attention Module (D-CBAM), and the multi-view features were then integrated by the Multi-View Cross Fusion Module (MCFM). Focal Loss was combined with stratified hybrid sampling to mitigate class imbalance. Evaluation on the constructed clinical dataset showed that the proposed model achieved a G-Mean of 77.22%, an F1-Score of 43.88%, a Matthews Correlation Coefficient (MCC) of 38.73%, and an Area Under the receiver operating Characteristic curve (AUC) of 0.7407. Compared with the recent method MCE-Net (Multi-view Contrastive representation prior and Ensemble classification Network), the G-Mean, F1-Score, and MCC were improved by 2.41, 2.34, and 3.41 percentage points, respectively; compared with the baseline model ResNet-18, they were improved by 4.85, 6.85, and 8.25 percentage points, respectively. These results verify the effectiveness of the proposed method for DA identification on small, imbalanced datasets and provide new insights and methods for the complex DA identification task.

Keywords: difficult airway recognition, multi-view learning, data augmentation, class imbalance, feature fusion, attention mechanism
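As an illustration of the imbalance-oriented quantities named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the standard binary definitions: G-Mean as the geometric mean of sensitivity and specificity, F1-Score as the harmonic mean of precision and sensitivity, MCC computed from the confusion matrix, and the usual alpha-balanced form of Focal Loss. The function names, the `alpha` and `gamma` defaults, and the confusion-matrix counts are hypothetical and are not taken from the paper.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for a single sample.

    p: predicted probability of the positive (difficult airway) class.
    y: ground-truth label, 1 (difficult) or 0 (normal).
    alpha, gamma: illustrative defaults, NOT the paper's settings.
    """
    p_t = p if y == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    # (1 - p_t) ** gamma down-weights samples that are already well classified.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def imbalance_metrics(tp, fp, tn, fn):
    """G-Mean, F1-Score and MCC from binary confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # recall on the minority (positive) class
    specificity = tn / (tn + fp)   # recall on the majority (negative) class
    precision = tp / (tp + fp)
    g_mean = math.sqrt(sensitivity * specificity)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {"g_mean": g_mean, "f1": f1, "mcc": mcc}


# Hypothetical counts for a 10%-positive test set (not the paper's data).
metrics = imbalance_metrics(tp=8, fp=20, tn=70, fn=2)
print(metrics)

# A confidently correct positive contributes far less loss than a missed one.
print(focal_loss(0.9, 1), focal_loss(0.1, 1))
```

With counts like these, plain accuracy is 78% even though precision on the minority class is below 30%, which is why metrics such as G-Mean, F1-Score, and MCC are more informative than accuracy on imbalanced data.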


CLC Number: TP391.4

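Songlin WU, Guangchao ZHANG, Yuan YAO, Bo PENG. Multi-view difficult airway recognition based on discriminant region guidance [J]. Journal of Computer Applications, DOI: 10.11772/j.issn.1001-9081.2024101404.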
