Lightweight gesture recognition algorithm for basketball referee

Journal of Computer Applications, 2023, Vol. 43, Issue (7): 2173-2181. DOI: 10.11772/j.issn.1001-9081.2022060810

Zhongyu LI¹, Haodong SUN¹, Jiao LI¹,²,³

1. Microelectronic Research and Development Center, Shanghai University, Shanghai 200444, China
2. School of Mechatronic Engineering and Automation, Shanghai University, Shanghai 200444, China
3. Key Laboratory of Advanced Display and System Applications, Ministry of Education (Shanghai University), Shanghai 200444, China

Received: 2022-06-06 Revised: 2022-09-07 Accepted: 2022-09-09 Online: 2023-07-20 Published: 2023-07-10
Contact: Jiao LI
About the authors: LI Zhongyu, born in Chongqing in 1997, M. S. candidate. His research interests include object detection.
SUN Haodong, born in Datong, Shanxi in 1998, M. S. candidate. His research interests include object detection.
LI Jiao, born in Shanghai in 1975, Ph. D., lecturer. Her research interests include pattern recognition.
Supported by: National Natural Science Foundation of China (52107239)

Abstract:

To address the difficulty of balancing the number of parameters, the computational cost, and the accuracy of general gesture recognition algorithms, a lightweight gesture recognition algorithm for basketball referees is proposed. The algorithm restructures the YOLOV5s (You Only Look Once Version 5s) algorithm in four steps. First, the Involution operator replaces the convolution operator in CSP1_1 (Cross Stage Partial 1_1) to widen the range over which context information is captured and to reduce kernel redundancy. Second, the Coordinate Attention (CA) mechanism is added after the C3 module to strengthen gesture feature extraction. Third, a lightweight content-aware upsampling operator is used to improve the original upsampling module, so that the sampling points concentrate on the object region and the background is ignored. Finally, Ghost-Net with SiLU (Sigmoid-weighted Linear Unit) as the activation function is used for lightweight pruning. Experimental results on a self-made basketball referee gesture dataset show that the computational cost, number of parameters, and model size of the proposed algorithm are 3.3 GFLOPs, 4.0×10⁶, and 8.5 MB respectively, which are 79%, 44%, and 40% lower than those of the YOLOV5s algorithm. The mAP@0.5 of the proposed algorithm is 91.7%, and its detection frame rate on game video with a resolution of 1 920×1 280 reaches 89.3 frame/s, indicating that the algorithm can meet the requirements of low error, high frame rate, and light weight.

Key words: object detection, gesture recognition, Involution operator, attention mechanism, upsampling, Ghost-Net
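The final step of the abstract combines a Ghost module with the SiLU activation. As a rough illustration only (not the authors' code), the sketch below defines SiLU as x·sigmoid(x) and compares the weight count of an ordinary convolution with that of a Ghost module, following the GhostNet formulation in which 1/s of the output channels come from an ordinary convolution and the remainder from cheap depthwise operations. The function names, the ratio `s=2`, the kernel sizes, and the channel counts are assumptions chosen for the example; biases and normalization layers are ignored.

```python
import math


def silu(x: float) -> float:
    """SiLU (sigmoid-weighted linear unit): x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights of an ordinary k x k convolution (no bias)."""
    return c_in * c_out * k * k


def ghost_params(c_in: int, c_out: int, k: int, s: int = 2, d: int = 3) -> int:
    """Weights of a Ghost module (no bias).

    Assumes c_out is divisible by s. An ordinary convolution produces
    c_out / s intrinsic maps; each intrinsic map then yields (s - 1)
    extra maps through a cheap d x d depthwise operation.
    """
    intrinsic = c_out // s
    primary = c_in * intrinsic * k * k
    cheap = intrinsic * (s - 1) * d * d
    return primary + cheap


# Hypothetical layer: 64 input channels, 128 output channels, 3 x 3 kernels.
plain = conv_params(64, 128, 3)
ghost = ghost_params(64, 128, 3)
print(plain, ghost, round(plain / ghost, 2))
```

With `s=2` the Ghost variant needs roughly half the weights of the ordinary convolution for this layer, which is the kind of saving that motivates using it for a lightweight model.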

CLC number: TP391.4

Zhongyu LI, Haodong SUN, Jiao LI. Lightweight gesture recognition algorithm for basketball referee[J]. Journal of Computer Applications, 2023, 43(7): 2173-2181.

Figures/Tables 19

Fig. 1 YOLOV5s network structure

Fig. 2 Principle and operation process of Involution

Fig. 3 Coordinate attention structure

Fig. 4 Principle of upsampling kernel prediction module

Fig. 5 Principle of feature reassembly module

Fig. 6 Ghost Bottleneck module structure

Fig. 7 Structure block diagram of reconstructed detection algorithm

Fig. 8 Common basketball referee gestures

Fig. 9 Statistics of label number distribution of each referee gesture in dataset

Fig. 10 20 sign language gestures in the 2013 Chalearn gesture dataset

Fig. 11 Statistics of label number distribution of 20 sign language gestures in training set

Tab. 1 Comparison experiment of introducing Involution operator into CSP

Module with Involution	Computational cost/GFLOPs	mAP@0.5/%
CSP1_1	5.3	90.6
CSP1_2	6.9	89.8
CSP1_3	16.7	89.1
CSP1_1, CSP1_2	3.0	86.4
CSP1_1, CSP1_2, CSP1_3	3.0	85.2
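Table 1 concerns where the Involution operator is placed. As a reminder of what the operator computes, here is a deliberately simplified pure-Python sketch, not the authors' implementation: each pixel generates its own k×k kernel from its channel vector, and that kernel is shared by all channels at that location. The simplifications are assumptions made for brevity: a single group, one linear map `weight` in place of the two-layer bottleneck used in the Involution paper, stride 1, and zero padding.

```python
def involution(x, weight, k=3):
    """Simplified involution on a C x H x W nested list.

    weight is a (k*k) x C matrix that maps the channel vector of a pixel
    to the k x k kernel used at that pixel (shared across channels).
    """
    c, h, w = len(x), len(x[0]), len(x[0][0])
    r = k // 2
    out = [[[0.0] * w for _ in range(h)] for _ in range(c)]
    for i in range(h):
        for j in range(w):
            # Kernel generation: conditioned only on the pixel at (i, j).
            pixel = [x[ch][i][j] for ch in range(c)]
            kernel = [sum(wt * p for wt, p in zip(row, pixel)) for row in weight]
            # Apply the same spatial kernel to every channel.
            for ch in range(c):
                acc = 0.0
                for u in range(-r, r + 1):
                    for v in range(-r, r + 1):
                        ii, jj = i + u, j + v
                        if 0 <= ii < h and 0 <= jj < w:  # zero padding
                            acc += kernel[(u + r) * k + (v + r)] * x[ch][ii][jj]
                out[ch][i][j] = acc
    return out


# Toy input: one channel, 2 x 2. Only the kernel centre is non-zero and
# equals the pixel value, so every output is the square of its input.
x = [[[1.0, 2.0], [3.0, 4.0]]]
weight = [[0.0] for _ in range(9)]
weight[4] = [1.0]
y = involution(x, weight)
print(y)
```

Unlike a convolution, whose kernel is fixed across positions and differs per channel, the kernel here varies with position and is shared across channels, which is the sense in which the abstract speaks of reducing kernel redundancy.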

Tab. 2 Comparison of performance of YOLOV5s before and after improvement

Algorithm	Parameters/10⁶	Computational cost/GFLOPs	Model size/MB	mAP@0.5/%	Precision/%	Recall/%	Frame rate/(frame·s⁻¹)
YOLOV5s	7.1	16.0	14.1	93.5	94.6	88.5	103.1
YOLOV5s-ghost	3.7	8.3	7.8	92.2	95.5	86.9	94.3
YOLOV5s + Involution	7.1	5.3	14.3	90.6	92.1	86.1	120.5
YOLOV5s + CA	7.1	16.1	14.2	94.0	92.9	88.9	84.0
YOLOV5s + Content-aware	7.2	16.5	14.4	93.7	94.4	87.8	94.3
YOLOV5s-ghost + CA + Involution + Content-aware	4.0	3.3	8.5	91.7	96.8	89.2	89.3
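The relative savings of the full model over YOLOV5s can be re-derived from the first and last rows of Table 2. The short check below uses only the values printed in the table; the dictionary keys and the helper name are illustrative.

```python
# Values copied from the first and last rows of Table 2.
baseline = {"GFLOPs": 16.0, "params_1e6": 7.1, "size_MB": 14.1}
proposed = {"GFLOPs": 3.3, "params_1e6": 4.0, "size_MB": 8.5}


def reduction_percent(before: float, after: float) -> float:
    """Relative reduction of `after` with respect to `before`, in percent."""
    return (before - after) / before * 100.0


reductions = {
    key: round(reduction_percent(baseline[key], proposed[key]))
    for key in baseline
}
print(reductions)
```

The rounded results are 79% for computational cost, 44% for the number of parameters, and 40% for model size, so these figures are reductions relative to YOLOV5s rather than the fraction that remains.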

Fig. 12 Detection accuracy of 20 referee gestures by using the proposed algorithm

Tab. 3 Comparison of mAP@0.5 values for the stop-the-clock-for-foul gesture and the offensive foul gesture in three algorithms (%)

Algorithm	Gesture type	mAP@0.5
YOLOV5s	Stop the clock for foul	84.3
YOLOV5s	Offensive foul	79.8
YOLOV5s-Involution	Stop the clock for foul	77.6
YOLOV5s-Involution	Offensive foul	73.2
YOLOV5s-ghost + CA + Involution + Content-aware	Stop the clock for foul	79.8
YOLOV5s-ghost + CA + Involution + Content-aware	Offensive foul	78.1

Tab. 4 Comparison of detection results between YOLOV5s algorithm and the proposed algorithm on 2013 Chalearn gesture dataset (%)

Algorithm	mAP@0.5	Precision	Recall
YOLOV5s	97.2	96.3	89.6
Proposed algorithm	95.2	96.8	89.3
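The metrics in Table 4 follow the usual object detection conventions. As a minimal sketch, assuming the standard rule that a prediction counts as a true positive when its intersection over union (IoU) with a ground-truth box of the same class is at least 0.5 (the threshold behind mAP@0.5), the helpers below compute IoU for two axis-aligned boxes and precision and recall from match counts. The function names, the box format `(x1, y1, x2, y2)`, and the example counts are assumptions, not taken from the paper.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(tp: int, fp: int, fn: int):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical boxes: overlap of 1 unit area out of a union of 7.
overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
is_match = overlap >= 0.5  # this pair would not count as a true positive
# Hypothetical counts: 9 true positives, 1 false positive, 3 missed objects.
p, r = precision_recall(tp=9, fp=1, fn=3)
print(overlap, is_match, p, r)
```

mAP@0.5 is then the mean, over gesture classes, of the area under each class's precision-recall curve at this IoU threshold.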

Fig. 13 Detection results of the proposed algorithm on 2013 Chalearn gesture dataset

Tab. 5 Comparison experiments between other object detection algorithms and the proposed algorithm

Algorithm	Model size/MB	Computational cost/GFLOPs	mAP@0.5/%	Frame rate/(frame·s⁻¹)
Faster RCNN	460.4	283.2	82.6	34.5
YOLOV3	235.6	154.9	81.3	66.7
YOLOV4	245.8	142.0	85.4	82.4
YOLOV3-Tiny	34.0	5.6	71.3	76.9
YOLOV4-Tiny	23.5	6.9	78.8	83.3
YOLOX-Tiny	5.1	6.45	86.3	86.9
YOLOV5s	14.1	16.0	93.5	103.1
Proposed algorithm	8.5	3.3	91.7	89.3

Fig. 14 Actual detection effect of the proposed referee gesture detection algorithm

