Journal of Computer Applications, 2023, Vol. 43, Issue (4): 1062-1070. DOI: 10.11772/j.issn.1001-9081.2022020270
• Artificial intelligence •
Rong GAO1,2, Jiawei SHEN1, Xiongkai SHAO1, Xinyun WU1
Received: 2022-03-09
Revised: 2022-05-20
Accepted: 2022-05-20
Online: 2022-08-16
Published: 2023-04-10
Contact: Rong GAO
About author: SHEN Jiawei, born in 1998, native of Huanggang, Hubei, M. S. candidate. His research interests include object detection and instance segmentation.
Rong GAO, Jiawei SHEN, Xiongkai SHAO, Xinyun WU. Instance segmentation algorithm based on Fastformer and self-supervised contrastive learning[J]. Journal of Computer Applications, 2023, 43(4): 1062-1070.
URL: http://www.joca.cn/EN/10.11772/j.issn.1001-9081.2022020270
Algorithm | Speed/(frame·s⁻¹) | mAP/% | AP50/% | AP (person)/% | AP (rider)/% | AP (car)/% | AP (bus)/% |
---|---|---|---|---|---|---|---|
SGN | 0.61 | 25.0 | 44.9 | 21.8 | 20.1 | 39.4 | 33.2 |
Mask R-CNN | 0.72 | 25.7 | 45.2 | 20.6 | 23.2 | 40.5 | 32.7 |
MEInst | 1.34 | 26.4 | 47.1 | 23.7 | 24.9 | 42.7 | 33.1 |
PANet | 2.24 | 26.2 | 49.9 | 30.5 | 23.7 | 46.9 | 32.2 |
CondInst | 0.84 | 31.8 | 57.1 | 36.8 | 30.4 | 54.8 | 36.3 |
Deep snake | 4.25 | 32.4 | 56.9 | 37.0 | 31.9 | 56.8 | 38.6 |
BlendMask | 4.67 | 31.7 | 58.4 | 37.2 | 27.0 | 56.0 | 40.5 |
SOLOv2 | 4.76 | 32.6 | 59.0 | 35.4 | 30.6 | 56.9 | 41.5 |
Proposed algorithm | 4.49 | 35.7 | 62.3 | 36.9 | 32.1 | 60.3 | 44.7 |
Tab. 1 Comparison of experimental results on Cityscapes dataset
Algorithm | Speed/(frame·s⁻¹) | AP/% | AP (small objects)/% | AP (medium objects)/% | AP (large objects)/% |
---|---|---|---|---|---|
SGN | 10.2 | 35.7 | 19.8 | 38.7 | 47.2 |
Mask R-CNN | 15.3 | 37.5 | 21.1 | 39.6 | 48.3 |
MEInst | 15.0 | 33.5 | 19.3 | 35.7 | 42.1 |
PANet | 14.6 | 32.7 | 20.1 | 36.8 | 44.5 |
CondInst | 15.4 | 37.8 | 21.0 | 40.3 | 48.7 |
Deep snake | 14.2 | 38.0 | 20.8 | 41.8 | 52.3 |
BlendMask | 15.0 | 37.8 | 18.8 | 40.9 | 53.6 |
SOLOv2 | 13.5 | 38.2 | 17.6 | 41.2 | 55.4 |
Proposed algorithm | 12.7 | 40.7 | 21.3 | 43.9 | 57.5 |
Tab. 2 Comparison of experimental results on COCO2017 dataset
Algorithm | mAP/% | AP50/% | Speed/(frame·s⁻¹) |
---|---|---|---|
Baseline | 32.6 | 59.0 | 4.76 |
Baseline+Fast | 34.5 | 61.2 | 4.52 |
Baseline+Fast+Cont | 35.7 | 62.3 | 4.49 |
Tab. 3 Experimental results of module analysis
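The per-module contribution in Tab. 3 can be read off by differencing consecutive rows. The following is a minimal sketch of that arithmetic, not part of the original paper. It assumes the three numeric columns are mAP/%, AP50/% and speed in frame/s, as the matching SOLOv2 and proposed-algorithm rows of Tab. 1 suggest, and that "Fast" and "Cont" stand for the Fastformer and contrastive-learning modules named in the title. The names `ABLATION` and `stepwise_gains` are hypothetical.

```python
# Values copied from Tab. 3 (module analysis).
# Assumed column order: name, mAP/%, AP50/%, speed in frame/s.
ABLATION = [
    ("Baseline", 32.6, 59.0, 4.76),
    ("Baseline+Fast", 34.5, 61.2, 4.52),
    ("Baseline+Fast+Cont", 35.7, 62.3, 4.49),
]


def stepwise_gains(rows):
    """Return the change in each metric relative to the previous row."""
    gains = []
    for prev, cur in zip(rows, rows[1:]):
        gains.append({
            "step": f"{prev[0]} -> {cur[0]}",
            "d_map": round(cur[1] - prev[1], 2),    # percentage points
            "d_ap50": round(cur[2] - prev[2], 2),   # percentage points
            "d_speed": round(cur[3] - prev[3], 2),  # frame/s
        })
    return gains


if __name__ == "__main__":
    for gain in stepwise_gains(ABLATION):
        print(gain)
```

Under these assumptions, the first step adds 1.9 mAP points and the second adds 1.2, while speed drops by 0.27 frame/s in total relative to the baseline.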