Journal of Computer Applications, 2023, Vol. 43, Issue (12): 3933-3940. DOI: 10.11772/j.issn.1001-9081.2022111687
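Received: 2022-11-10
Revised: 2023-05-23
Accepted: 2023-05-29
Online: 2023-07-26
Published: 2023-12-10
Corresponding author: Peng WU
About the author: LIU Lei, born in 2002 in Qingdao, Shandong, male. His research interests include image processing and artificial intelligence.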
Lei LIU, Peng WU, Kai XIE, Beizhi CHENG, Guanqun SHENG
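Abstract:
To address the drop in accuracy and effectiveness of parking space prediction caused by factors such as illumination changes and parking space occlusion in intelligent parking space management systems, a parking space detection method based on a self-supervised learning Histogram of Oriented Gradients (HOG) prediction auxiliary task was proposed. Firstly, a self-supervised auxiliary task that predicts the HOG features of the occluded parts of an image was designed, and the MobileViTBlock (light-weight, general-purpose, and Mobile-friendly Vision Transformer Block) was used to integrate global image information, so that the model learned the visual representation of images more fully and its feature extraction ability improved. Secondly, the SE (Squeeze-and-Excitation) attention mechanism was improved, so that the model matched or even exceeded the original SE attention mechanism at a lower computational cost. Finally, the feature extraction part trained on the auxiliary task was applied to the downstream classification task to predict parking space status, and experiments were carried out on a mixed dataset of PKLot and CNRPark. Experimental results show that the proposed model reaches an accuracy of 97.49% on the test set and improves occlusion prediction accuracy by 5.46 percentage points compared with RepVGG, a considerable improvement over the other parking space detection algorithms compared.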
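To make the auxiliary-task target more concrete, the following is a minimal sketch of a HOG-style descriptor for a single grayscale patch, written in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the names `cell_hog` and `mse`, the 9 unsigned orientation bins, the centered-difference gradients and the single-cell L2 normalization are all choices made here, and the block normalization of the standard HOG descriptor is omitted. In a masked-prediction setup, a network would regress a vector like this for each occluded patch, with a loss such as the mean squared error shown.

```python
import math


def cell_hog(patch, n_bins=9):
    """L2-normalized histogram of unsigned gradient orientations for one patch.

    `patch` is a list of rows of grayscale values (hypothetical input format).
    """
    height, width = len(patch), len(patch[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    # Skip the border so that centered differences stay inside the patch.
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]
            gy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue
            # Unsigned orientation in [0, 180) degrees.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            index = int(angle // bin_width) % n_bins
            # Each pixel votes with its gradient magnitude (no bin interpolation).
            hist[index] += magnitude
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist


def mse(prediction, target):
    """Mean squared error between a predicted and a target descriptor."""
    return sum((p - t) ** 2 for p, t in zip(prediction, target)) / len(target)


# A vertical edge: intensity changes only along x, so all votes fall in bin 0.
vertical_edge = [[0.0] * 4 + [1.0] * 4 for _ in range(8)]
# A horizontal edge: intensity changes only along y, so all votes fall in bin 4.
horizontal_edge = [[0.0] * 8 for _ in range(4)] + [[1.0] * 8 for _ in range(4)]

hog_vertical = cell_hog(vertical_edge)
hog_horizontal = cell_hog(horizontal_edge)
```

Because the descriptor depends on gradient directions and is normalized, it is insensitive to a uniform brightness scaling of the patch, which is one plausible reason such a target suits scenes with illumination changes.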
Lei LIU, Peng WU, Kai XIE, Beizhi CHENG, Guanqun SHENG. Parking space detection method based on self-supervised learning HOG prediction auxiliary task[J]. Journal of Computer Applications, 2023, 43(12): 3933-3940.
Item | Details |
---|---|
CPU | AMD Ryzen 7 5800H with Radeon Graphics |
Memory | 16 GB |
GPU | NVIDIA GeForce RTX 3060 Laptop GPU |
Operating system | Windows 10 |
Software | Python 3.8, OpenCV, PyTorch 1.11.0+cu115 |
Tab. 1 Computer hardware and software configuration
Model | C1 accuracy | C1 F1 score | C2 accuracy | C2 F1 score |
---|---|---|---|---|
Model 1 | 97.54 | 97.44 | 97.99 | 98.16 |
Model 2 | 99.24 | 99.22 | 98.09 | 98.23 |
Tab. 2 Comparison of detection performance of two models under C1 and C2 conditions (%)
Attention | Test condition | Accuracy/% | F1 score/% | Parameters/10³ | Computation/10³ |
---|---|---|---|---|---|
SE | C1 | 98.47 | 97.58 | 412.672 | 431 447 |
SE | C2 | 97.72 | 97.91 | 412.672 | 431 447 |
CSE | C1 | 99.24 | 99.22 | 404.382 | 431 442 |
CSE | C2 | 98.09 | 98.23 | 404.382 | 431 442 |
Tab. 3 Comparison of effects of two attention mechanisms applied to the proposed model under C1 and C2 conditions
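For readers unfamiliar with the baseline in Tab. 3, the following plain-Python sketch shows the original Squeeze-and-Excitation block (global average pooling, two fully connected layers, channel-wise rescaling) together with its parameter count. It illustrates standard SE only; the improved CSE variant is not specified in this section and is not reproduced here. The function names, the list-based tensor layout, the inclusion of bias terms and the example channel and reduction values are assumptions made for illustration.

```python
import math


def se_block(feature_maps, w1, b1, w2, b2):
    """Apply a standard SE block to a list of C channels (each an H x W grid).

    w1 has shape (C // r) x C and w2 has shape C x (C // r); both are
    hypothetical weights supplied by the caller.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_maps
    ]
    # Excitation, step 1: reduce dimension with a fully connected layer and ReLU.
    hidden = [
        max(0.0, sum(w * z for w, z in zip(row, squeezed)) + bias)
        for row, bias in zip(w1, b1)
    ]
    # Excitation, step 2: restore dimension and squash to (0, 1) with a sigmoid.
    scales = [
        1.0 / (1.0 + math.exp(-(sum(w * h for w, h in zip(row, hidden)) + bias)))
        for row, bias in zip(w2, b2)
    ]
    # Scale: reweight every channel by its learned importance.
    return [
        [[scale * value for value in row] for row in channel]
        for channel, scale in zip(feature_maps, scales)
    ]


def se_param_count(channels, reduction):
    """Weights plus biases of the two fully connected layers of one SE block."""
    hidden = channels // reduction
    return channels * hidden + hidden + hidden * channels + channels


# With all-zero weights the sigmoid outputs 0.5, so every channel is halved.
features = [[[2.0, 4.0]], [[6.0, 8.0]]]
halved = se_block(features, w1=[[0.0, 0.0]], b1=[0.0], w2=[[0.0], [0.0]], b2=[0.0, 0.0])
```

The parameter count grows with the square of the channel width, which is why changes to the attention module can shift the totals reported in the table even though the computation barely changes.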
Test condition | Model | Accuracy | Precision | Recall | F1 score |
---|---|---|---|---|---|
C1 | VGG16 | 96.35 | 99.56 | 92.93 | 96.12 |
C1 | ResNet18 | 96.08 | 99.80 | 92.15 | 95.82 |
C1 | MCNN | 96.33 | 98.72 | 93.63 | 96.11 |
C1 | RepVGG | 97.63 | 98.64 | 96.46 | 97.53 |
C1 | mAlexNet | 96.53 | 99.07 | 94.34 | 96.64 |
C1 | Proposed model | 99.24 | 99.67 | 98.78 | 99.22 |
C2 | VGG16 | 97.21 | 98.29 | 96.55 | 97.41 |
C2 | ResNet18 | 97.81 | 99.28 | 96.67 | 97.95 |
C2 | MCNN | 97.76 | 98.59 | 96.68 | 97.62 |
C2 | RepVGG | 97.82 | 98.15 | 97.82 | 97.98 |
C2 | mAlexNet | 97.29 | 97.60 | 97.21 | 97.40 |
C2 | Proposed model | 98.09 | 98.87 | 97.60 | 98.23 |
Tab. 4 Comparison of detection performance of different models under C1 and C2 conditions (%)
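The F1 score column can be related to the precision and recall columns through the usual harmonic-mean definition, F1 = 2PR / (P + R). The short sketch below applies that definition to the proposed model's rows; the function name `f1_score` is chosen here for illustration. Because the tabulated precision and recall are already rounded, recomputed values for some other rows may differ from the table in the last digit.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both in the same unit, here %)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Proposed model, precision and recall taken from the table (in %).
f1_c1 = round(f1_score(99.67, 98.78), 2)  # C1 condition
f1_c2 = round(f1_score(98.87, 97.60), 2)  # C2 condition
print(f1_c1, f1_c2)
```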
Model | Accuracy | Light occlusion | Moderate occlusion | Heavy occlusion |
---|---|---|---|---|
MCNN | 89.93 | 93.24 | 82.33 | 75.15 |
mAlexNet | 88.05 | 90.43 | 84.53 | 77.41 |
RepVGG | 92.03 | 95.63 | 89.58 | 84.48 |
Proposed model | 97.49 | 97.96 | 92.28 | 88.63 |
Tab. 5 Accuracy comparison of different models under different occlusion conditions on test set (%)
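The gains over each baseline follow directly from the Accuracy column as differences in percentage points. The sketch below recomputes them; the dictionary and variable names are introduced here for illustration only, and the values are copied from the table.

```python
# Overall test-set accuracy (%) copied from the Accuracy column.
accuracy = {
    "MCNN": 89.93,
    "mAlexNet": 88.05,
    "RepVGG": 92.03,
    "Proposed model": 97.49,
}

# Percentage-point gain of the proposed model over each baseline.
gains = {
    name: round(accuracy["Proposed model"] - value, 2)
    for name, value in accuracy.items()
    if name != "Proposed model"
}
print(gains)
```

The RepVGG entry corresponds to the 5.46 percentage point improvement quoted for the proposed model.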
Model | Accuracy | Precision | Recall | F1 score |
---|---|---|---|---|
MCNN | 75.03 | 75.49 | 80.95 | 78.12 |
RepVGG | 79.83 | 81.16 | 83.65 | 82.38 |
mAlexNet | 78.95 | 78.91 | 83.37 | 81.18 |
Proposed model | 82.54 | 82.19 | 87.26 | 84.64 |
Tab. 6 Comparison of detection performance of different models on LC dataset (%)
Model | Computation/10⁶ | Parameters/10⁴ | Inference time/ms |
---|---|---|---|
VGG16 | 15 483.86 | 13 836 | 6.49 |
ResNet18 | 1 819.07 | 1 169 | 2.90 |
MCNN | 24.40 | 6 | 1.84 |
RepVGG | 1 362.03 | 703 | 2.69 |
mAlexNet | 21.22 | 3 | 1.27 |
Proposed model | 431.44 | 40 | 3.96 |
Tab. 7 Comparison of computational cost, number of parameters and inference time of different models