Journal of Computer Applications, 2022, Vol. 42, Issue (8): 2593-2600. DOI: 10.11772/j.issn.1001-9081.2021061075
• Multimedia computing and computer simulation •
Jiehang DENG1, Wenquan GUO1, Hanjie CHEN2, Guosheng GU1, Jingjian LIU3, Yukun DU3, Chao LIU3, Xiaodong KANG3, Jian ZHAO3
Received: 2021-06-25
Revised: 2022-03-24
Accepted: 2022-04-02
Online: 2022-04-19
Published: 2022-08-10
Contact: Guosheng GU
About author: DENG Jiehang, born in 1979 in Guangzhou, Guangdong, Ph.D., associate professor. His research interests include image processing and object recognition.
Abstract:
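Detection accuracy is low when only a few diatom training samples are available. To address this, a few-shot diatom detection model (MMSOFDD) that combines Multi-scale Multi-head Self-attention (MMS) and Online Hard Example Mining (OHEM) was proposed on the basis of the few-shot object detection model TFA (Two-stage Fine-tuning Approach). Firstly, ResNet-101 was combined with the multi-head self-attention mechanism to construct BoTNet-101, a Transformer-based feature extraction network, so that both the local and the global information of diatom images could be fully used. Then, multi-head self-attention was improved to MMS, which removed the limitation that the original multi-head self-attention handles targets at a single scale only. Finally, OHEM was introduced into the model predictor, which recognizes and localizes the diatoms. Ablation and comparison experiments between the proposed model and other few-shot object detection models were carried out on a self-built diatom dataset. Experimental results show that the mean Average Precision (mAP) of MMSOFDD is 69.60%, which is 5.89 percentage points higher than the 63.71% of TFA. Compared with the few-shot object detection models Meta R-CNN and FSIW, whose mAP values are 61.60% and 60.90% respectively, the proposed model improves the mAP by 8.00 and 8.70 percentage points respectively. These results indicate that MMSOFDD effectively improves diatom detection accuracy when training samples are scarce.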
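The OHEM step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that per-RoI losses are already available as plain floats and keeps only the highest-loss RoIs for the training loss. The function names `select_hard_examples` and `ohem_loss` and the sample loss values are hypothetical, and the sketch omits the NMS de-duplication used in the original OHEM formulation; it is not the authors' implementation.

```python
from typing import List, Sequence


def select_hard_examples(losses: Sequence[float], batch_size: int) -> List[int]:
    """Return the indices of the `batch_size` highest-loss RoIs, hardest first."""
    if batch_size <= 0:
        return []
    # Rank RoI indices by loss in descending order and keep the top ones.
    ranked = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return ranked[:batch_size]


def ohem_loss(losses: Sequence[float], batch_size: int) -> float:
    """Average the loss over the selected hard examples only."""
    keep = select_hard_examples(losses, batch_size)
    if not keep:
        return 0.0
    return sum(losses[i] for i in keep) / len(keep)


# Hypothetical per-RoI losses from one forward pass.
roi_losses = [0.05, 2.30, 0.10, 1.75, 0.02, 0.90]
hard_indices = select_hard_examples(roi_losses, batch_size=3)
print(hard_indices)  # [1, 3, 5]
```

Easy RoIs (low loss) contribute nothing to the update here, so the few available training instances are spent on the cases the detector currently gets wrong.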
Jiehang DENG, Wenquan GUO, Hanjie CHEN, Guosheng GU, Jingjian LIU, Yukun DU, Chao LIU, Xiaodong KANG, Jian ZHAO. Few-shot diatom detection combining multi-scale multi-head self-attention and online hard example mining[J]. Journal of Computer Applications, 2022, 42(8): 2593-2600.
Class | Images | Instances |
---|---|---|
Total | 20 335 | 34 084 |
aeroplane | 908 | 1 171 |
bicycle | 795 | 1 064 |
boat | 689 | 1 140 |
bottle | 950 | 1 764 |
car | 1 874 | 3 267 |
cat | 1 417 | 1 593 |
chair | 1 564 | 3 152 |
diningtable | 738 | 824 |
dog | 1 707 | 2 015 |
horse | 769 | 1 072 |
person | 6 095 | 13 256 |
pottedplant | 772 | 1 487 |
sheep | 421 | 1 070 |
train | 805 | 925 |
tvmonitor | 831 | 1 108 |
Tab. 1 Classification details of 15 classes in Pascal VOC dataset
Genus | Images | Instances |
---|---|---|
Total | 2 606 | 2 652 |
Cyclotella | 400 | 417 |
Navicula | 400 | 404 |
Nitzschia | 400 | 407 |
Synedra | 280 | 283 |
Gomphonema | 220 | 222 |
Cymbella | 346 | 347 |
Cocconeis | 160 | 161 |
Melosira | 400 | 411 |
Tab. 2 Diatom dataset statistics
Genus | Instances | Genus | Instances |
---|---|---|---|
Cyclotella | 10 | Cymbella | 10 |
Navicula | 10 | Cocconeis | 10 |
Nitzschia | 10 | Melosira | 10 |
Synedra | 10 | Total | 80 |
Gomphonema | 10 | | |
Tab. 3 Few-shot diatom training set
Genus | Instances | Genus | Instances |
---|---|---|---|
Cyclotella | 407 | Cymbella | 337 |
Navicula | 394 | Cocconeis | 151 |
Nitzschia | 397 | Melosira | 401 |
Synedra | 273 | Total | 2 572 |
Gomphonema | 212 | | |
Tab. 4 Diatom test set
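The instance counts in Tab. 2 to Tab. 4 are consistent with a 10-shot split in which 10 instances per genus go to training and all remaining instances go to testing. The sketch below reproduces those totals under that assumption. The helper `k_shot_split`, the synthetic instance identifiers, and the random seed are hypothetical; the page does not describe how the authors actually chose the 10 training instances.

```python
import random
from collections import defaultdict


def k_shot_split(instances, k, seed=0):
    """Split (instance_id, class_name) pairs into a k-shot train set and a test set."""
    by_class = defaultdict(list)
    for inst_id, cls in instances:
        by_class[cls].append(inst_id)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for cls, ids in by_class.items():
        if len(ids) < k:
            raise ValueError(f"class {cls!r} has fewer than {k} instances")
        picked = set(rng.sample(ids, k))
        for inst_id in ids:
            if inst_id in picked:
                train.append((inst_id, cls))
            else:
                test.append((inst_id, cls))
    return train, test


# Per-genus instance counts from Tab. 2 (English genus names).
instance_counts = {
    "Cyclotella": 417, "Navicula": 404, "Nitzschia": 407, "Synedra": 283,
    "Gomphonema": 222, "Cymbella": 347, "Cocconeis": 161, "Melosira": 411,
}
# Synthetic identifiers stand in for real annotated instances.
instances = [(f"{cls}_{n}", cls) for cls, count in instance_counts.items() for n in range(count)]

train_set, test_set = k_shot_split(instances, k=10)
print(len(train_set), len(test_set))  # 80 2572
```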
Method | Multi-scale multi-head self-attention | OHEM | mAP/% |
---|---|---|---|
TFA | | | 63.71 |
Improved method 1 | | | 65.18 |
Improved method 2 | | | 66.83 |
MMSOFDD | | | 69.60 |
Tab. 5 Comparison of mAP of different modules
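The gain of each ablation variant over the TFA baseline follows directly from the mAP column of Tab. 5. A small sketch of that arithmetic, with the hypothetical names `ablation_map` and `gains`:

```python
# mAP values (%) taken from Tab. 5.
ablation_map = {
    "TFA": 63.71,
    "Improved method 1": 65.18,
    "Improved method 2": 66.83,
    "MMSOFDD": 69.60,
}

baseline = ablation_map["TFA"]
# Gain over the baseline in percentage points, rounded to two decimals.
gains = {
    name: round(value - baseline, 2)
    for name, value in ablation_map.items()
    if name != "TFA"
}
print(gains["MMSOFDD"])  # 5.89
```

The 5.89 percentage point gain of the full model matches the figure quoted in the abstract.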
Category | Model | Cyclotella | Navicula | Nitzschia | Synedra | Gomphonema | Cymbella | Cocconeis | Melosira | mAP/% | Std. dev. |
---|---|---|---|---|---|---|---|---|---|---|---|
Meta-learning based | FSRW | 90.57 | 44.36 | 57.71 | 79.83 | 47.14 | 50.61 | 79.51 | 85.39 | 66.89 | 17.60 |
 | Meta R-CNN | 89.70 | 39.70 | 51.40 | 78.20 | 33.20 | 36.90 | 78.60 | 85.40 | 61.60 | 22.14 |
 | FSIW | 90.10 | 45.70 | 58.80 | 76.70 | 32.90 | 40.80 | 82.60 | 81.10 | 60.90 | 20.45 |
Transfer learning based | Context-Transformer | 17.49 | 13.09 | 16.28 | 12.38 | 9.79 | 12.46 | 5.95 | 21.23 | 13.58 | 4.42 |
 | MPSR | 89.86 | 31.93 | 51.58 | 82.37 | 46.23 | 37.18 | 79.84 | 68.25 | 60.91 | 20.67 |
Metric learning based | DeFRCN | 93.37 | 33.04 | 49.41 | 77.29 | 41.63 | 49.96 | 84.05 | 88.33 | 64.64 | 22.08 |
Fine-tuning based | FSCE | 90.14 | 33.08 | 48.48 | 74.76 | 42.56 | 46.04 | 75.61 | 79.39 | 61.26 | 19.65 |
 | MMSOFDD | 90.85 | 55.37 | 59.10 | 79.51 | 62.60 | 44.14 | 81.48 | 75.18 | 69.60 | 14.67 |
Tab. 6 Comparison of AP (per-genus columns, in %), mAP and standard deviation of AP among different models for 8 species of diatoms
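The mAP and standard deviation columns of Tab. 6 summarize the eight per-genus AP values of each model. The sketch below recomputes both for the FSRW row, assuming the standard deviation is the population form (divide by N rather than N - 1); that assumption is not stated on the page but reproduces the FSRW figures. The variable names are hypothetical.

```python
from statistics import fmean, pstdev

# Per-genus AP values (%) for FSRW, copied from Tab. 6.
fsrw_ap = [90.57, 44.36, 57.71, 79.83, 47.14, 50.61, 79.51, 85.39]

fsrw_map = fmean(fsrw_ap)   # mean Average Precision over the 8 genera
fsrw_std = pstdev(fsrw_ap)  # population standard deviation of the APs

print(f"{fsrw_map:.2f} {fsrw_std:.2f}")  # 66.89 17.60
```

A lower standard deviation means the detector performs more evenly across genera, which is why the table reports it alongside mAP.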
References:
[1] PIETTE M H A, DE LETTER E A. Drowning: still a difficult autopsy diagnosis[J]. Forensic Science International, 2006, 163(1/2): 1-9. DOI: 10.1016/j.forsciint.2004.10.027
[2] ZHANG P P, KANG X D, ZHANG S R, et al. The length and width of diatoms in drowning cases as the evidence of diatoms penetrating the alveoli-capillary barrier[J]. International Journal of Legal Medicine, 2020, 134(3): 1037-1042. DOI: 10.1007/s00414-019-02164-4
[3] PEDRAZA A, BUENO G, DENIZ O, et al. Automated diatom classification (part B): a deep learning approach[J]. Applied Sciences, 2017, 7(5): No.460. DOI: 10.3390/app7050460
[4] ZHOU Y Y, ZHANG J, HUANG J, et al. Digital whole-slide image analysis for automated diatom test in forensic cases of drowning using a convolutional neural network algorithm[J]. Forensic Science International, 2019, 302: No.109922. DOI: 10.1016/j.forsciint.2019.109922
[5] DENG J H, HE D D, ZHUO J H, et al. Deep learning network-based recognition and localization of diatom images against complex background[J]. Journal of Southern Medical University, 2020, 40(2): 183-189. (in Chinese) DOI: 10.12122/j.issn.1673-4254.2020.02.03
[6] REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149. DOI: 10.1109/tpami.2016.2577031
[7] KRAUSE L M K, KOC J, ROSENHAHN B, et al. Fully convolutional neural network for detection and counting of diatoms on coatings after short-term field exposure[J]. Environmental Science and Technology, 2020, 54(16): 10022-10030. DOI: 10.1021/acs.est.0c01982
[8] LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 3431-3440. DOI: 10.1109/cvpr.2015.7298965
[9] YU W M, XUE Y, KNOOPS R, et al. Automated diatom searching in the digital scanning electron microscopy images of drowning cases using the deep neural networks[J]. International Journal of Legal Medicine, 2021, 135(2): 497-508. DOI: 10.1007/s00414-020-02392-z
[10] LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 2999-3007. DOI: 10.1109/iccv.2017.324
[11] KANG B Y, LIU Z, WANG X, et al. Few-shot object detection via feature reweighting[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 8419-8428. DOI: 10.1109/iccv.2019.00851
[12] XIAO Y, MARLET R. Few-shot object detection and viewpoint estimation for objects in the wild[C]// Proceedings of the 2020 European Conference on Computer Vision, LNCS 12362. Cham: Springer, 2020: 192-210.
[13] YAN X P, CHEN Z L, XU A N, et al. Meta R-CNN: towards general solver for instance-level low-shot learning[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 9576-9585. DOI: 10.1109/iccv.2019.00967
[14] FAN Q, ZHUO W, TANG C K, et al. Few-shot object detection with attention-RPN and multi-relation detector[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 4012-4021. DOI: 10.1109/cvpr42600.2020.00407
[15] KARLINSKY L, SHTOK J, HARARY S, et al. RepMet: representative-based metric learning for classification and few-shot object detection[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 5192-5201. DOI: 10.1109/cvpr.2019.00534
[16] YANG Z, WANG Y L, CHEN X Y, et al. Context-transformer: tackling object confusion for few-shot detection[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2020: 12653-12660. DOI: 10.1609/aaai.v34i07.6957
[17] CHEN H, WANG Y L, WANG G Y, et al. LSTD: a low-shot transfer detector for object detection[C]// Proceedings of the 32nd AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2018: 2836-2843. DOI: 10.1609/aaai.v32i1.11716
[18] WANG X, HUANG T E, DARRELL T, et al. Frustratingly simple few-shot object detection[C]// Proceedings of the 37th International Conference on Machine Learning. New York: JMLR.org, 2020: 9919-9928.
[19] SUN B, LI B H, CAI S C, et al. FSCE: few-shot object detection via contrastive proposal encoding[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 7348-7358. DOI: 10.1109/cvpr46437.2021.00727
[20] EVERINGHAM M, VAN GOOL L, WILLIAMS C K I, et al. The PASCAL Visual Object Classes (VOC) challenge[J]. International Journal of Computer Vision, 2010, 88(2): 303-338. DOI: 10.1007/s11263-009-0275-4
[21] LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context[C]// Proceedings of the 2014 European Conference on Computer Vision, LNCS 8693. Cham: Springer, 2014: 740-755.
[22] SHRIVASTAVA A, GUPTA A, GIRSHICK R. Training region-based object detectors with online hard example mining[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 761-769. DOI: 10.1109/cvpr.2016.89
[23] HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 770-778. DOI: 10.1109/cvpr.2016.90
[24] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2017: 6000-6010.
[25] SRINIVAS A, LIN T Y, PARMAR N, et al. Bottleneck transformers for visual recognition[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 16514-16524. DOI: 10.1109/cvpr46437.2021.01625
[26] RAMACHANDRAN P, ZOPH B, LE Q V. Searching for activation functions[EB/OL]. (2017-10-27) [2021-06-16].
[27] BELLO I, ZOPH B, LE Q V, et al. Attention augmented convolutional networks[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 3285-3294. DOI: 10.1109/iccv.2019.00338
[28] NEUBECK A, VAN GOOL L. Efficient non-maximum suppression[C]// Proceedings of the 18th International Conference on Pattern Recognition. Piscataway: IEEE, 2006: 850-855. DOI: 10.1109/icpr.2006.479
[29] WU J X, LIU S T, HUANG D, et al. Multi-scale positive sample refinement for few-shot object detection[C]// Proceedings of the 2020 European Conference on Computer Vision, LNCS 12361. Cham: Springer, 2020: 456-472.
[30] QIAO L M, ZHAO Y X, LI Z Y, et al. DeFRCN: decoupled faster R-CNN for few-shot object detection[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 8661-8670. DOI: 10.1109/iccv48922.2021.00856