Lightweight gesture recognition algorithm for basketball referee

Journal of Computer Applications, 2023, Vol. 43, Issue (7): 2173-2181. DOI: 10.11772/j.issn.1001-9081.2022060810

Artificial intelligence

Zhongyu LI¹, Haodong SUN¹, Jiao LI¹,²,³

1. Microelectronic Research and Development Center, Shanghai University, Shanghai 200444, China
2. School of Mechatronic Engineering and Automation, Shanghai University, Shanghai 200444, China
3. Key Laboratory of Advanced Display and System Applications, Ministry of Education (Shanghai University), Shanghai 200444, China

Received: 2022-06-06 Revised: 2022-09-07 Accepted: 2022-09-09 Online: 2023-07-20 Published: 2023-07-10
Contact: Jiao LI
About author: LI Zhongyu, born in 1997, from Chongqing, M. S. candidate. His research interests include object detection.
SUN Haodong, born in 1998, from Datong, Shanxi, M. S. candidate. His research interests include object detection.
LI Jiao, born in 1975, from Shanghai, Ph. D., lecturer. Her research interests include pattern recognition.
Supported by:
National Natural Science Foundation of China (52107239)

Abstract

To address the difficulty of balancing the number of parameters, the computational cost and the accuracy in general gesture recognition algorithms, a lightweight gesture recognition algorithm for basketball referees was proposed. The algorithm was reconstructed on the basis of the YOLOV5s (You Only Look Once Version 5s) algorithm. Firstly, the Involution operator was used to replace the convolution operator in CSP1_1 (Cross Stage Partial 1_1) to expand the range over which context information is captured and to reduce kernel redundancy. Secondly, the Coordinate Attention (CA) mechanism was added after the C3 module to strengthen gesture feature extraction. Thirdly, a lightweight content-aware upsampling operator was used to improve the original upsampling module, so that the sampling points were concentrated in the object area and the background was ignored. Finally, Ghost-Net with SiLU (Sigmoid-weighted Linear Unit) as the activation function was used for lightweight pruning. Experimental results on a self-made basketball referee gesture dataset show that the computational cost, number of parameters and model size of the proposed algorithm are 3.3 GFLOPs, 4.0×10⁶ and 8.5 MB respectively, which are 79%, 44% and 40% lower than those of the YOLOV5s algorithm (16.0 GFLOPs, 7.1×10⁶ and 14.1 MB in Tab. 2). The mAP@0.5 of the proposed algorithm is 91.7%, and its detection frame rate on game video with a resolution of 1920×1280 reaches 89.3 frame/s, verifying that the proposed algorithm can meet the requirements of low error, high frame rate and lightweight design.

Key words: object detection, gesture recognition, Involution operator, attention mechanism, upsampling, Ghost-Net
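
The abstract names SiLU as the activation function used with Ghost-Net. As a minimal illustration, assuming the usual definition SiLU(x) = x·sigmoid(x), the sketch below evaluates it next to ReLU on a few scalar inputs. The function names are illustrative only and are not taken from the paper's implementation.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def silu(x: float) -> float:
    """SiLU: the input weighted by its own sigmoid, x * sigmoid(x)."""
    return x * sigmoid(x)


def relu(x: float) -> float:
    """ReLU, shown only for comparison."""
    return max(0.0, x)


if __name__ == "__main__":
    for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
        print(f"x={x:+.1f}  relu={relu(x):.4f}  silu={silu(x):.4f}")
```

Unlike ReLU, SiLU is smooth and lets small negative values through, while approaching the identity for large positive inputs.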

CLC Number:

TP391.4

Zhongyu LI, Haodong SUN, Jiao LI. Lightweight gesture recognition algorithm for basketball referee[J]. Journal of Computer Applications, 2023, 43(7): 2173-2181.

Figures/Tables 19

Fig. 1 YOLOV5s network structure

Fig. 2 Principle and operation process of Involution

Fig. 3 Coordinate attention structure

Fig. 4 Principle of upsampling kernel prediction module

Fig. 5 Principle of feature reassembly module

Fig. 6 Ghost Bottleneck module structure

Fig. 7 Structure block diagram of reconstructed detection algorithm

Fig. 8 Common basketball referee gestures

Fig. 9 Statistics of label number distribution of each referee gesture in dataset

Fig. 10 Twenty sign language gestures in ChaLearn gesture dataset

Fig. 11 Statistics of label number distribution of 20 sign language gestures in training set

Tab. 1 Comparison experiment of introducing Involution operator into CSP

Module with Involution introduced	Computation/GFLOPs	mAP@0.5/%
CSP1_1	5.3	90.6
CSP1_2	6.9	89.8
CSP1_3	16.7	89.1
CSP1_1, CSP1_2	3.0	86.4
CSP1_1, CSP1_2, CSP1_3	3.0	85.2

Tab. 2 Comparison of performance of YOLOV5s before and after improvement

Algorithm	Parameters/10⁶	Computation/GFLOPs	Model size/MB	mAP@0.5/%	Precision/%	Recall/%	Frame rate/(frame·s⁻¹)
YOLOV5s	7.1	16.0	14.1	93.5	94.6	88.5	103.1
YOLOV5s-ghost	3.7	8.3	7.8	92.2	95.5	86.9	94.3
YOLOV5s + Involution	7.1	5.3	14.3	90.6	92.1	86.1	120.5
YOLOV5s + CA	7.1	16.1	14.2	94.0	92.9	88.9	84.0
YOLOV5s + Content-aware	7.2	16.5	14.4	93.7	94.4	87.8	94.3
YOLOV5s-ghost + CA + Involution + Content-aware	4.0	3.3	8.5	91.7	96.8	89.2	89.3
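
The relative savings quoted in the abstract can be checked directly against the first and last rows of Tab. 2. The short sketch below does that arithmetic; the dictionary keys and function names are illustrative, and the numbers are copied from the table.

```python
# Values copied from Tab. 2 (first row: YOLOV5s, last row: proposed algorithm).
BASELINE = {"params_1e6": 7.1, "gflops": 16.0, "size_mb": 14.1}
PROPOSED = {"params_1e6": 4.0, "gflops": 3.3, "size_mb": 8.5}


def reduction_percent(baseline: float, value: float) -> float:
    """Percentage by which `value` is lower than `baseline`."""
    return (1.0 - value / baseline) * 100.0


def reductions(baseline: dict, proposed: dict) -> dict:
    """Rounded percentage reduction for every metric in `baseline`."""
    return {
        key: round(reduction_percent(baseline[key], proposed[key]))
        for key in baseline
    }


if __name__ == "__main__":
    for metric, pct in reductions(BASELINE, PROPOSED).items():
        print(f"{metric}: {pct}% lower than YOLOV5s")
```

The rounded values are 44% (parameters), 79% (computation) and 40% (model size), so the three percentages describe reductions relative to YOLOV5s rather than the fraction that remains.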

Fig. 12 Detection accuracy of 20 referee gestures by using the proposed algorithm

Tab. 3 Comparison of mAP@0.5 values for the stop-the-clock-for-foul gesture and the offensive-foul gesture across three algorithms

Algorithm	Gesture type	mAP@0.5/%
YOLOV5s	Stop the clock for foul	84.3
YOLOV5s	Offensive foul	79.8
YOLOV5s-Involution	Stop the clock for foul	77.6
YOLOV5s-Involution	Offensive foul	73.2
YOLOV5s-ghost + CA + Involution + Content-aware	Stop the clock for foul	79.8
YOLOV5s-ghost + CA + Involution + Content-aware	Offensive foul	78.1

Tab. 4 Comparison of detection results between YOLOV5s algorithm and the proposed algorithm on 2013 ChaLearn gesture dataset

Algorithm	mAP@0.5/%	Precision/%	Recall/%
YOLOV5s	97.2	96.3	89.6
Proposed algorithm	95.2	96.8	89.3

Fig. 13 Detection results of the proposed algorithm on 2013 ChaLearn gesture dataset

Tab. 5 Comparison experiments between other object detection algorithms and the proposed algorithm

Algorithm	Model size/MB	Computation/GFLOPs	mAP@0.5/%	Frame rate/(frame·s⁻¹)
Faster RCNN	460.4	283.2	82.6	34.5
YOLOV3	235.6	154.9	81.3	66.7
YOLOV4	245.8	142.0	85.4	82.4
YOLOV3-Tiny	34.0	5.6	71.3	76.9
YOLOV4-Tiny	23.5	6.9	78.8	83.3
YOLOX-Tiny	5.1	6.45	86.3	86.9
YOLOV5s	14.1	16.0	93.5	103.1
Proposed algorithm	8.5	3.3	91.7	89.3
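
One way to read Tab. 5 is as a trade-off between computation and accuracy. The sketch below, which is an illustration and not part of the paper's evaluation, keeps only the algorithms that no other row beats on both computation (lower is better) and mAP@0.5 (higher is better). The `Entry` type and function names are hypothetical; the numbers are copied from the table, and model size and frame rate are deliberately left out.

```python
from typing import List, NamedTuple


class Entry(NamedTuple):
    name: str
    gflops: float  # computation, lower is better
    map50: float   # mAP@0.5 in percent, higher is better


# Rows copied from Tab. 5.
TAB5 = [
    Entry("Faster RCNN", 283.2, 82.6),
    Entry("YOLOV3", 154.9, 81.3),
    Entry("YOLOV4", 142.0, 85.4),
    Entry("YOLOV3-Tiny", 5.6, 71.3),
    Entry("YOLOV4-Tiny", 6.9, 78.8),
    Entry("YOLOX-Tiny", 6.45, 86.3),
    Entry("YOLOV5s", 16.0, 93.5),
    Entry("Proposed algorithm", 3.3, 91.7),
]


def dominates(a: Entry, b: Entry) -> bool:
    """True if a is no worse than b on both axes and strictly better on one."""
    no_worse = a.gflops <= b.gflops and a.map50 >= b.map50
    strictly_better = a.gflops < b.gflops or a.map50 > b.map50
    return no_worse and strictly_better


def pareto_front(entries: List[Entry]) -> List[Entry]:
    """Entries that are not dominated by any other entry."""
    return [e for e in entries if not any(dominates(o, e) for o in entries)]


if __name__ == "__main__":
    for entry in pareto_front(TAB5):
        print(entry.name, entry.gflops, entry.map50)
```

On these two axes only YOLOV5s and the proposed algorithm remain: YOLOV5s has the highest mAP@0.5, while the proposed algorithm has the lowest computation and a higher mAP@0.5 than every other row.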

Fig. 14 Actual detection effect of the proposed referee gesture detection algorithm
