Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (8): 2593-2600. DOI: 10.11772/j.issn.1001-9081.2021061075

• Multimedia computing and computer simulation •

Few-shot diatom detection combining multi-scale multi-head self-attention and online hard example mining

Jiehang DENG1, Wenquan GUO1, Hanjie CHEN2, Guosheng GU1, Jingjian LIU3, Yukun DU3, Chao LIU3, Xiaodong KANG3, Jian ZHAO3

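  1. School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006
  2. School of Automation, Guangdong University of Technology, Guangzhou 510006
  3. Key Laboratory of Forensic Pathology, Ministry of Public Security (Guangzhou Forensic Science Institute), Guangzhou 510442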
  • Received: 2021-06-25 Revised: 2022-03-24 Accepted: 2022-04-02 Online: 2022-04-19 Published: 2022-08-10
  • Corresponding author: Guosheng GU
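  • About the authors: DENG Jiehang, born in 1979 in Guangzhou, Guangdong, Ph. D., associate professor. His research interests include image processing and object recognition.
    GUO Wenquan, born in 1996 in Guangzhou, Guangdong, M. S. candidate. His research interests include image processing and object recognition.
    CHEN Hanjie, born in 2000 in Zhaoqing, Guangdong. His research interests include deep learning and Field-Programmable Gate Arrays (FPGA).
    GU Guosheng, born in 1978 in Xinxing, Guangdong, Ph. D., lecturer. His research interests include image processing.
    LIU Jingjian, born in 1995 in Huili, Sichuan, M. S. candidate. His research interests include forensic pathology.
    DU Yukun, born in 1997 in Xinyang, Henan, M. S. candidate. His research interests include forensic pathology.
    LIU Chao, born in 1963 in Danjiangkou, Hubei, Ph. D., chief forensic physician. His research interests include individual identification and cause-of-death determination.
    KANG Xiaodong, born in 1988 in Tianshui, Gansu. His research interests include forensic pathology.
    ZHAO Jian, born in 1988 in Chengdu, Sichuan, M. S. His research interests include forensic pathology and diatom testing.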
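  • Supported by:
    National Natural Science Foundation of China (61202267); Guangdong University of Technology Innovation Training Project (xj202111845544); Guangzhou Science and Technology Program (2019030001)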

Few-shot diatom detection combining multi-scale multi-head self-attention and online hard example mining

Jiehang DENG1, Wenquan GUO1, Hanjie CHEN2, Guosheng GU1, Jingjian LIU3, Yukun DU3, Chao LIU3, Xiaodong KANG3, Jian ZHAO3

  1. School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, Guangdong 510006, China
  2. School of Automation, Guangdong University of Technology, Guangzhou, Guangdong 510006, China
  3. Key Laboratory of Forensic Pathology, Ministry of Public Security (Guangzhou Forensic Science Institute), Guangzhou, Guangdong 510442, China
  • Received: 2021-06-25 Revised: 2022-03-24 Accepted: 2022-04-02 Online: 2022-04-19 Published: 2022-08-10
  • Corresponding author: Guosheng GU
  • About the authors: DENG Jiehang, born in 1979, Ph. D., associate professor. His research interests include image processing and object recognition.
    GUO Wenquan, born in 1996, M. S. candidate. His research interests include image processing and object recognition.
    CHEN Hanjie, born in 2000. His research interests include deep learning and Field-Programmable Gate Arrays (FPGA).
    GU Guosheng, born in 1978, Ph. D., lecturer. His research interests include image processing.
    LIU Jingjian, born in 1995, M. S. candidate. His research interests include forensic pathology.
    DU Yukun, born in 1997, M. S. candidate. His research interests include forensic pathology.
    LIU Chao, born in 1963, Ph. D., chief forensic physician. His research interests include individual identification and cause-of-death determination.
    KANG Xiaodong, born in 1988. His research interests include forensic pathology.
    ZHAO Jian, born in 1988, M. S. His research interests include forensic pathology and diatom detection.
  • Supported by:
    National Natural Science Foundation of China (61202267); Guangdong University of Technology Innovation Training Project (xj202111845544); Guangzhou Science and Technology Program (2019030001)

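Abstract:

Detection precision is low when few diatom training samples are available. To address this, a few-shot diatom detection model combining Multi-scale Multi-head Self-attention (MMS) and Online Hard Example Mining (OHEM), named MMSOFDD, is proposed on the basis of the few-shot object detection model TFA (Two-stage Fine-tuning Approach). First, ResNet-101 is combined with a multi-head self-attention mechanism to construct a Transformer-based feature extraction network, BoTNet-101, which makes full use of both the local and the global information in diatom images. Then, multi-head self-attention is improved into MMS, removing the original mechanism's limitation of handling objects at a single scale. Finally, OHEM is introduced into the model's predictor, and the diatoms are identified and localized. Ablation and comparison experiments against other few-shot object detection models are carried out on a self-built diatom dataset. The results show that the mean Average Precision (mAP) of MMSOFDD is 69.60%, against 63.71% for TFA, an improvement of 5.89 percentage points. Compared with the few-shot object detection models Meta R-CNN and FSIW, whose mAPs are 61.60% and 60.90% respectively, the proposed model improves mAP by 8.00 and 8.70 percentage points. MMSOFDD can therefore effectively improve diatom detection precision when few training samples are available.

Key words: few-shot, diatom detection, convolutional neural network, Transformer, online hard example mining, multi-scale multi-head self-attention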

Abstract:

Detection precision is low when the diatom training set is small. To address this, a few-shot diatom detection model based on Multi-scale Multi-head Self-attention (MMS) and Online Hard Example Mining (OHEM), named MMSOFDD, was proposed on top of the few-shot object detection model Two-stage Fine-tuning Approach (TFA). Firstly, a Transformer-based feature extraction network, Bottleneck Transformer Network-101 (BoTNet-101), was constructed by combining ResNet-101 with a multi-head self-attention mechanism, so as to make full use of the local and global information in diatom images. Then, multi-head self-attention was improved to MMS, which removed the original mechanism's limitation of handling objects at a single scale. Finally, OHEM was introduced into the model predictor, and the diatoms were identified and localized. Ablation and comparison experiments between the proposed model and other few-shot object detection models were conducted on a self-constructed diatom dataset. Experimental results show that the mean Average Precision (mAP) of MMSOFDD is 69.60%, which is 5.89 percentage points higher than the 63.71% of TFA. Compared with the few-shot object detection models Meta R-CNN and Few-Shot In Wild (FSIW), whose mAPs are 61.60% and 60.90% respectively, the proposed model improves mAP by 8.00 and 8.70 percentage points respectively. MMSOFDD can therefore effectively improve diatom detection precision when only a small number of diatom training samples is available.

Key words: few-shot, diatom detection, Convolutional Neural Network (CNN), Transformer, Online Hard Example Mining (OHEM), Multi-scale Multi-head Self-attention (MMS)
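
The abstract states that OHEM is introduced into the model predictor but does not describe how. As a rough illustration of the general OHEM idea only, and not of the authors' implementation, the sketch below ranks candidate regions by their current loss and averages the loss over the hardest ones. The function names, the `keep` parameter and the sample loss values are all hypothetical; the sketch also omits steps that a full detector would need, such as suppressing overlapping regions before selection.

```python
from typing import List, Sequence


def select_hard_examples(losses: Sequence[float], keep: int) -> List[int]:
    """Return the indices of the `keep` highest-loss candidates, hardest first.

    `losses` holds one (hypothetical) per-region loss value per candidate.
    """
    if keep <= 0:
        return []
    # Rank candidate indices by loss, largest loss first.
    ranked = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return ranked[:keep]


def ohem_loss(losses: Sequence[float], keep: int) -> float:
    """Average the loss over the selected hard examples only.

    Easy examples (low loss) are ignored, so they contribute no gradient.
    """
    hard = select_hard_examples(losses, keep)
    if not hard:
        return 0.0
    return sum(losses[i] for i in hard) / len(hard)


if __name__ == "__main__":
    # Made-up per-region losses, for illustration only.
    roi_losses = [0.05, 1.20, 0.30, 2.10, 0.01, 0.75]
    print(select_hard_examples(roi_losses, keep=3))  # [3, 1, 5]
    print(ohem_loss(roi_losses, keep=3))
```

With few training samples, most candidate regions are easy background, so restricting the loss to the hardest candidates is one plausible way to keep them from dominating training.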

CLC number: