Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (4): 1062-1070.DOI: 10.11772/j.issn.1001-9081.2022020270
Special Issue: Artificial intelligence
• Artificial intelligence •
Rong GAO1,2, Jiawei SHEN1, Xiongkai SHAO1, Xinyun WU1
Received: 2022-03-09
Revised: 2022-05-20
Accepted: 2022-05-20
Online: 2022-08-16
Published: 2023-04-10
Contact: Rong GAO
About author: SHEN Jiawei, born in 1998 in Huanggang, Hubei, M. S. candidate. His research interests include object detection and instance segmentation.
Rong GAO, Jiawei SHEN, Xiongkai SHAO, Xinyun WU. Instance segmentation algorithm based on Fastformer and self-supervised contrastive learning[J]. Journal of Computer Applications, 2023, 43(4): 1062-1070.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2022020270
| Algorithm | Speed/(frame·s⁻¹) | mAP/% | AP50/% | AP (person)/% | AP (rider)/% | AP (car)/% | AP (bus)/% |
|---|---|---|---|---|---|---|---|
| SGN | 0.61 | 25.0 | 44.9 | 21.8 | 20.1 | 39.4 | 33.2 |
| Mask R-CNN | 0.72 | 25.7 | 45.2 | 20.6 | 23.2 | 40.5 | 32.7 |
| MEInst | 1.34 | 26.4 | 47.1 | 23.7 | 24.9 | 42.7 | 33.1 |
| PANet | 2.24 | 26.2 | 49.9 | 30.5 | 23.7 | 46.9 | 32.2 |
| CondInst | 0.84 | 31.8 | 57.1 | 36.8 | 30.4 | 54.8 | 36.3 |
| Deep snake | 4.25 | 32.4 | 56.9 | 37.0 | 31.9 | 56.8 | 38.6 |
| BlendMask | 4.67 | 31.7 | 58.4 | 37.2 | 27.0 | 56.0 | 40.5 |
| SOLOv2 | 4.76 | 32.6 | 59.0 | 35.4 | 30.6 | 56.9 | 41.5 |
| Proposed algorithm | 4.49 | 35.7 | 62.3 | 36.9 | 32.1 | 60.3 | 44.7 |
Tab. 1 Comparison of experimental results on Cityscapes dataset
| Algorithm | Speed/(frame·s⁻¹) | AP/% | AP (small objects)/% | AP (medium objects)/% | AP (large objects)/% |
|---|---|---|---|---|---|
| SGN | 10.2 | 35.7 | 19.8 | 38.7 | 47.2 |
| Mask R-CNN | 15.3 | 37.5 | 21.1 | 39.6 | 48.3 |
| MEInst | 15.0 | 33.5 | 19.3 | 35.7 | 42.1 |
| PANet | 14.6 | 32.7 | 20.1 | 36.8 | 44.5 |
| CondInst | 15.4 | 37.8 | 21.0 | 40.3 | 48.7 |
| Deep snake | 14.2 | 38.0 | 20.8 | 41.8 | 52.3 |
| BlendMask | 15.0 | 37.8 | 18.8 | 40.9 | 53.6 |
| SOLOv2 | 13.5 | 38.2 | 17.6 | 41.2 | 55.4 |
| Proposed algorithm | 12.7 | 40.7 | 21.3 | 43.9 | 57.5 |
Tab. 2 Comparison of experimental results on COCO2017 dataset
| Algorithm | mAP/% | AP50/% | Speed/(frame·s⁻¹) |
|---|---|---|---|
| Baseline | 32.6 | 59.0 | 4.76 |
| Baseline+Fast | 34.5 | 61.2 | 4.52 |
| Baseline+Fast+Cont | 35.7 | 62.3 | 4.49 |
Tab. 3 Experimental results of module analysis
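
The module analysis can be read as a sequence of step-by-step gains. The short sketch below computes the change from each row of Tab. 3 to the next. It assumes the three numeric columns are mAP/%, AP50/% and speed in frame·s⁻¹, which is inferred from the Baseline and final rows matching the SOLOv2 and proposed-algorithm rows of Tab. 1; the names `ABLATION` and `stepwise_gains` are illustrative only and do not come from the paper.

```python
# Rows copied from Tab. 3. Column meaning (mAP/%, AP50/%, frame/s) is an
# assumption inferred from the matching rows of Tab. 1.
ABLATION = [
    ("Baseline", 32.6, 59.0, 4.76),
    ("Baseline+Fast", 34.5, 61.2, 4.52),
    ("Baseline+Fast+Cont", 35.7, 62.3, 4.49),
]


def stepwise_gains(rows):
    """Return (name, d_mAP, d_AP50, d_speed) for each row versus the previous one."""
    gains = []
    for prev, cur in zip(rows, rows[1:]):
        gains.append((
            cur[0],
            round(cur[1] - prev[1], 1),  # change in mAP, percentage points
            round(cur[2] - prev[2], 1),  # change in AP50, percentage points
            round(cur[3] - prev[3], 2),  # change in speed, frame/s
        ))
    return gains


if __name__ == "__main__":
    for name, d_map, d_ap50, d_speed in stepwise_gains(ABLATION):
        print(f"{name}: mAP {d_map:+.1f}, AP50 {d_ap50:+.1f}, speed {d_speed:+.2f} frame/s")
```

Under that reading of the columns, most of the accuracy gain and most of the speed cost appear at the Baseline+Fast step, while the Baseline+Fast+Cont step adds a smaller accuracy gain at almost no further cost in speed.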