Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (4): 1062-1070.DOI: 10.11772/j.issn.1001-9081.2022020270
Special Issue: Artificial intelligence
Rong GAO, Jiawei SHEN, Xiongkai SHAO, Xinyun WU
Received: 2022-03-09
Revised: 2022-05-20
Accepted: 2022-05-20
Online: 2022-08-16
Published: 2023-04-10
Contact: Rong GAO
About author: SHEN Jiawei, born in 1998 in Huanggang, Hubei, M. S. candidate. His research interests include object detection and instance segmentation.
Rong GAO, Jiawei SHEN, Xiongkai SHAO, Xinyun WU. Instance segmentation algorithm based on Fastformer and self-supervised contrastive learning[J]. Journal of Computer Applications, 2023, 43(4): 1062-1070.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2022020270
Algorithm | Speed/(frame·s⁻¹) | mAP/% | AP50/% | AP (person)/% | AP (rider)/% | AP (car)/% | AP (bus)/% |
---|---|---|---|---|---|---|---|
SGN | 0.61 | 25.0 | 44.9 | 21.8 | 20.1 | 39.4 | 33.2 |
Mask R-CNN | 0.72 | 25.7 | 45.2 | 20.6 | 23.2 | 40.5 | 32.7 |
MEInst | 1.34 | 26.4 | 47.1 | 23.7 | 24.9 | 42.7 | 33.1 |
PANet | 2.24 | 26.2 | 49.9 | 30.5 | 23.7 | 46.9 | 32.2 |
CondInst | 0.84 | 31.8 | 57.1 | 36.8 | 30.4 | 54.8 | 36.3 |
Deep snake | 4.25 | 32.4 | 56.9 | 37.0 | 31.9 | 56.8 | 38.6 |
BlendMask | 4.67 | 31.7 | 58.4 | 37.2 | 27.0 | 56.0 | 40.5 |
SOLOv2 | 4.76 | 32.6 | 59.0 | 35.4 | 30.6 | 56.9 | 41.5 |
Proposed algorithm | 4.49 | 35.7 | 62.3 | 36.9 | 32.1 | 60.3 | 44.7 |
Tab. 1 Comparison of experimental results on Cityscapes dataset
Algorithm | Speed/(frame·s⁻¹) | AP/% | AP (small objects)/% | AP (medium objects)/% | AP (large objects)/% |
---|---|---|---|---|---|
SGN | 10.2 | 35.7 | 19.8 | 38.7 | 47.2 |
Mask R-CNN | 15.3 | 37.5 | 21.1 | 39.6 | 48.3 |
MEInst | 15.0 | 33.5 | 19.3 | 35.7 | 42.1 |
PANet | 14.6 | 32.7 | 20.1 | 36.8 | 44.5 |
CondInst | 15.4 | 37.8 | 21.0 | 40.3 | 48.7 |
Deep snake | 14.2 | 38.0 | 20.8 | 41.8 | 52.3 |
BlendMask | 15.0 | 37.8 | 18.8 | 40.9 | 53.6 |
SOLOv2 | 13.5 | 38.2 | 17.6 | 41.2 | 55.4 |
Proposed algorithm | 12.7 | 40.7 | 21.3 | 43.9 | 57.5 |
Tab. 2 Comparison of experimental results on COCO2017 dataset
Algorithm | mAP/% | AP50/% | Speed/(frame·s⁻¹) |
---|---|---|---|
Baseline | 32.6 | 59.0 | 4.76 |
Baseline+Fast | 34.5 | 61.2 | 4.52 |
Baseline+Fast+Cont | 35.7 | 62.3 | 4.49 |
Tab. 3 Experimental results of module analysis
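The per-module contribution in Tab. 3 can be read off by differencing consecutive rows. The sketch below does that arithmetic in Python. It assumes the three numeric columns are mAP/%, AP50/%, and speed in frame/s (inferred from the Baseline row matching the SOLOv2 row of Tab. 1), and that "Fast" and "Cont" stand for the Fastformer and contrastive learning modules named in the title. The names `ablation` and `step_gains` are illustrative, not from the paper.

```python
# Rows of Tab. 3 as (mAP/%, AP50/%, speed in frame/s).
# Column meaning is an assumption inferred from Tab. 1.
ablation = {
    "Baseline": (32.6, 59.0, 4.76),
    "Baseline+Fast": (34.5, 61.2, 4.52),
    "Baseline+Fast+Cont": (35.7, 62.3, 4.49),
}


def step_gains(rows):
    """Return the change in each metric between consecutive configurations."""
    names = list(rows)  # dicts preserve insertion order
    gains = {}
    for prev, curr in zip(names, names[1:]):
        # Round to 2 decimals to hide floating-point noise in the differences.
        gains[curr] = tuple(
            round(c - p, 2) for p, c in zip(rows[prev], rows[curr])
        )
    return gains


gains = step_gains(ablation)
for name, (d_map, d_ap50, d_speed) in gains.items():
    print(f"{name}: mAP {d_map:+.1f}, AP50 {d_ap50:+.1f}, "
          f"speed {d_speed:+.2f} frame/s")
```

Under these assumptions, each added module raises mAP and AP50 by one to two points while reducing throughput by well under one frame per second.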