Journal of Computer Applications, 2023, Vol. 43, Issue (7): 2173-2181. DOI: 10.11772/j.issn.1001-9081.2022060810
Zhongyu LI, Haodong SUN, Jiao LI
Received: 2022-06-06
Revised: 2022-09-07
Accepted: 2022-09-09
Online: 2023-07-20
Published: 2023-07-10
Contact: Jiao LI
About author: LI Zhongyu, born in 1997 in Chongqing, M. S. candidate. His research interests include object detection.
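Abstract:
To address the difficulty of balancing parameter count, computational cost, and accuracy in general gesture recognition algorithms, a lightweight basketball referee gesture recognition algorithm is proposed. The algorithm is rebuilt on the basis of YOLOV5s (You Only Look Once Version 5s). Firstly, the convolution operator in CSP1_1 is replaced with the Involution operator to widen the range over which contextual information is captured and to reduce kernel redundancy. Secondly, a Coordinate Attention (CA) mechanism is added after the C3 module to strengthen gesture feature extraction. Thirdly, the original upsampling module is replaced with a lightweight content-aware upsampling operator, which concentrates sampling points on the target region and ignores the background. Finally, Ghost-Net with SiLU as the activation function is used for lightweight pruning. Experimental results on a self-made basketball referee gesture dataset show that the computational cost, parameter count, and model size of the proposed algorithm are 3.3 GFLOPs, 4.0×10⁶, and 8.5 MB respectively, which are 79%, 44%, and 40% lower than those of YOLOV5s. The mAP@0.5 is 91.7%, and the detection frame rate on game video at a resolution of 1920×1280 reaches 89.3 frame/s, indicating that the algorithm meets the requirements of low error, high frame rate, and light weight.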
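The parameter saving that motivates the Ghost-Net step can be illustrated with a simple weight count. The sketch below is not the authors' configuration: the channel counts, kernel sizes, and the ratio `s` are illustrative assumptions, and the counting follows the general Ghost module idea (a primary convolution produces a fraction of the output maps, and cheap depthwise operations generate the rest). The function names are hypothetical.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_params(c_in, c_out, k, s=2, d=3):
    """Weights in a Ghost-style module (bias ignored).

    A primary k x k convolution yields c_out // s intrinsic maps; depthwise
    d x d "cheap" operations generate the remaining (s - 1) groups of maps.
    The values of s and d are assumptions made for illustration only.
    """
    if c_out % s != 0:
        raise ValueError("c_out must be divisible by s")
    intrinsic = c_out // s
    primary = c_in * intrinsic * k * k
    cheap = (s - 1) * intrinsic * d * d
    return primary + cheap


standard = conv_params(64, 128, 3)
ghost = ghost_params(64, 128, 3)
print(standard, ghost, round(ghost / standard, 3))
```

With these assumed sizes the Ghost-style layer needs roughly half the weights of the standard convolution, which is in line with the lower parameter count the abstract reports for the Ghost-based model.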
Zhongyu LI, Haodong SUN, Jiao LI. Lightweight gesture recognition algorithm for basketball referee[J]. Journal of Computer Applications, 2023, 43(7): 2173-2181.
Module with Involution introduced | Computational cost/GFLOPs | mAP@0.5/% |
---|---|---|
CSP1_1 | 5.3 | 90.6 |
CSP1_2 | 6.9 | 89.8 |
CSP1_3 | 16.7 | 89.1 |
CSP1_1, CSP1_2 | 3.0 | 86.4 |
CSP1_1, CSP1_2, CSP1_3 | 3.0 | 85.2 |
Tab. 1 Comparison experiment of introducing the Involution operator into CSP
Algorithm | Parameters/10⁶ | Computational cost/GFLOPs | Model size/MB | mAP@0.5/% | Precision/% | Recall/% | Frame rate/(frame·s⁻¹) |
---|---|---|---|---|---|---|---|
YOLOV5s | 7.1 | 16.0 | 14.1 | 93.5 | 94.6 | 88.5 | 103.1 |
YOLOV5s-ghost | 3.7 | 8.3 | 7.8 | 92.2 | 95.5 | 86.9 | 94.3 |
YOLOV5s + Involution | 7.1 | 5.3 | 14.3 | 90.6 | 92.1 | 86.1 | 120.5 |
YOLOV5s + CA | 7.1 | 16.1 | 14.2 | 94.0 | 92.9 | 88.9 | 84.0 |
YOLOV5s + Content-aware | 7.2 | 16.5 | 14.4 | 93.7 | 94.4 | 87.8 | 94.3 |
YOLOV5s-ghost + CA + Involution + Content-aware | 4.0 | 3.3 | 8.5 | 91.7 | 96.8 | 89.2 | 89.3 |
Tab. 2 Comparison of performance of YOLOV5s before and after improvement
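The reduction percentages quoted in the abstract can be recomputed from the first and last rows of Tab. 2. The snippet below is a small arithmetic check, with values copied from the table. The dictionary keys and the helper name `relative_reduction` are illustrative choices, not names from the paper.

```python
# Values copied from Tab. 2: YOLOV5s (baseline) and the full improved model.
baseline = {"params_1e6": 7.1, "gflops": 16.0, "size_mb": 14.1}
proposed = {"params_1e6": 4.0, "gflops": 3.3, "size_mb": 8.5}


def relative_reduction(before, after):
    """Percentage by which `after` is smaller than `before`."""
    return (before - after) / before * 100.0


reductions = {
    key: round(relative_reduction(baseline[key], proposed[key]))
    for key in baseline
}
print(reductions)
```

Rounded to whole percentages, the results agree with the 79% (computational cost), 44% (parameters), and 40% (model size) stated in the abstract.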
Algorithm | Gesture type | mAP@0.5/% |
---|---|---|
YOLOV5s | Stop the clock for foul | 84.3 |
YOLOV5s | Offensive foul | 79.8 |
YOLOV5s-Involution | Stop the clock for foul | 77.6 |
YOLOV5s-Involution | Offensive foul | 73.2 |
YOLOV5s-ghost + CA + Involution + Content-aware | Stop the clock for foul | 79.8 |
YOLOV5s-ghost + CA + Involution + Content-aware | Offensive foul | 78.1 |
Tab. 3 Comparison of mAP@0.5 values (%) between the stop-the-clock-for-foul gesture and the offensive foul gesture in three algorithms
Algorithm | mAP@0.5/% | Precision/% | Recall/% |
---|---|---|---|
YOLOV5s | 97.2 | 96.3 | 89.6 |
Proposed algorithm | 95.2 | 96.8 | 89.3 |
Tab. 4 Comparison of detection results (%) between the YOLOV5s algorithm and the proposed algorithm on the 2013 Chalearn gesture dataset
Algorithm | Model size/MB | Computational cost/GFLOPs | mAP@0.5/% | Frame rate/(frame·s⁻¹) |
---|---|---|---|---|
Faster RCNN | 460.4 | 283.2 | 82.6 | 34.5 |
YOLOV3 | 235.6 | 154.9 | 81.3 | 66.7 |
YOLOV4 | 245.8 | 142.0 | 85.4 | 82.4 |
YOLOV3-Tiny | 34.0 | 5.6 | 71.3 | 76.9 |
YOLOV4-Tiny | 23.5 | 6.9 | 78.8 | 83.3 |
YOLOX-Tiny | 5.1 | 6.45 | 86.3 | 86.9 |
YOLOV5s | 14.1 | 16.0 | 93.5 | 103.1 |
Proposed algorithm | 8.5 | 3.3 | 91.7 | 89.3 |
Tab. 5 Comparison experiments between other object detection algorithms and the proposed algorithm
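One way to read Tab. 5 is to ask which detectors are not beaten on both computational cost and accuracy at once. The sketch below applies a plain Pareto-dominance filter to the GFLOPs and mAP@0.5 columns. This analysis is not part of the paper; it is an illustrative reading aid, and the function name `pareto_front` is hypothetical.

```python
# (name, computational cost in GFLOPs, mAP@0.5 in %) copied from Tab. 5.
models = [
    ("Faster RCNN", 283.2, 82.6),
    ("YOLOV3", 154.9, 81.3),
    ("YOLOV4", 142.0, 85.4),
    ("YOLOV3-Tiny", 5.6, 71.3),
    ("YOLOV4-Tiny", 6.9, 78.8),
    ("YOLOX-Tiny", 6.45, 86.3),
    ("YOLOV5s", 16.0, 93.5),
    ("Proposed algorithm", 3.3, 91.7),
]


def pareto_front(rows):
    """Return names of rows that no other row dominates.

    A row is dominated when another row has cost no higher and accuracy no
    lower, and is strictly better in at least one of the two.
    """
    front = []
    for name, cost, score in rows:
        dominated = any(
            c <= cost and s >= score and (c < cost or s > score)
            for _, c, s in rows
        )
        if not dominated:
            front.append(name)
    return front


print(pareto_front(models))
```

On these two columns only YOLOV5s and the proposed algorithm remain: YOLOV5s has the highest mAP@0.5, while the proposed algorithm has the lowest computational cost with the second-highest mAP@0.5.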