Detection algorithm for tennis ball and curtain object in tennis scene

doi:10.11772/j.issn.1001-9081.2025020194

Journal of Computer Applications

Received: 2025-02-28 Revised: 2025-04-17 Online: 2025-04-29 Published: 2025-04-29
Supported by:
Jiangxi Province Annual Graduate Innovation Special Fund Project

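HU Shengkang¹, LIANG Guanghua¹, YANG Zhen², YAO Chen³, LI Min¹

1. Jiangxi Science and Technology Normal University
2. School of Communication and Electronics, Jiangxi Science and Technology Normal University
3. The Third Research Institute of the Ministry of Public Security

Corresponding author: HU Shengkang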

Abstract

To address the low accuracy of tennis ball and curtain detection in training-hall scenes, HSK-YOLOv8 was proposed, an algorithm that improves detection accuracy by expanding the scope of feature extraction and optimizing path strategies. Firstly, the Mixed Local Channel Attention (MLCA) mechanism was integrated into the C2f module to fuse features within local regions and exchange information across regions, improving the extraction of high-quality features and easing multi-layer feature fusion. Secondly, the Upsample module was replaced with Content-Aware Reassembly of Features (CARAFE), whose content-aware feature reassembly strategy raised resolution and enabled finer-grained feature capture. Thirdly, convolutional layers (Conv) with 256 or more output channels were replaced with the Adaptive Downsampling Module (ADown) from YOLOv9, which mitigated the problem of diminishing gradients through selective gradient descent in parallel network channels and improved detection precision. Finally, the Pyramid 2 Layer (P2) of YOLOv8 was enabled to provide additional deep supervision for the detector, optimizing the deep neural network architecture and improving training efficiency. Experimental results showed that, compared with the baseline model YOLOv8n, HSK-YOLOv8 increased GFLOPs by 4.5, recall by 1.5%, mAP50 by 1.2%, and mAP50-95 by 1.1%. The proposed model was also more accurate than other mainstream detection models.
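The three module substitutions described in the abstract can be illustrated with a minimal sketch. The layer list format, the function name `apply_hsk_substitutions`, and the label `C2f_MLCA` are hypothetical and chosen only for illustration; the toy layer list is not the actual YOLOv8n configuration, and the added P2 detection layer is not modeled here.

```python
from typing import List, Tuple

# (module name, output channels): a hypothetical toy format, not a real config schema
Layer = Tuple[str, int]


def apply_hsk_substitutions(layers: List[Layer], min_channels: int = 256) -> List[Layer]:
    """Apply the three module swaps named in the abstract to a toy layer list."""
    swapped: List[Layer] = []
    for name, out_ch in layers:
        if name == "C2f":
            # Hypothetical label for a C2f block with MLCA integrated
            name = "C2f_MLCA"
        elif name == "Upsample":
            # Upsample is replaced with CARAFE
            name = "CARAFE"
        elif name == "Conv" and out_ch >= min_channels:
            # Only Conv layers with 256 or more output channels become ADown
            name = "ADown"
        swapped.append((name, out_ch))
    return swapped


# Toy layer list for demonstration only (an assumption, not the paper's network)
toy_layers = [
    ("Conv", 64), ("Conv", 128), ("C2f", 128),
    ("Conv", 256), ("C2f", 256), ("Upsample", 256), ("Conv", 512),
]
for layer in apply_hsk_substitutions(toy_layers):
    print(layer)
```

In this sketch the two narrow Conv layers (64 and 128 channels) are left untouched, which reflects the channel threshold stated in the abstract.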

Keywords: YOLOv8, mixed local channel attention, content-aware reassembly of features, adaptive downsampling module
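The idea behind content-aware reassembly can be sketched in plain Python for a single-channel feature map. This is a simplified illustration under stated assumptions, not the implementation used in HSK-YOLOv8: the per-position kernel logits are passed in by the caller (in CARAFE they are predicted from the features by a small convolutional branch), the kernel size `k=3` and `scale=2` are illustrative choices, and the names `carafe_upsample` and `softmax` are hypothetical.

```python
import math
from typing import List

Grid = List[List[float]]


def softmax(values: List[float]) -> List[float]:
    """Numerically stable softmax so each reassembly kernel sums to 1."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def carafe_upsample(feature: Grid, kernel_logits: List[List[List[float]]],
                    scale: int = 2, k: int = 3) -> Grid:
    """Upsample an H x W map to (H*scale) x (W*scale).

    kernel_logits[oy][ox] holds k*k logits for output position (oy, ox).
    Each output value is a weighted sum over the k x k neighborhood of the
    source position it maps back to, with zero padding at the borders.
    """
    h, w = len(feature), len(feature[0])
    r = k // 2
    out = [[0.0] * (w * scale) for _ in range(h * scale)]
    for oy in range(h * scale):
        for ox in range(w * scale):
            sy, sx = oy // scale, ox // scale  # source location
            weights = softmax(kernel_logits[oy][ox])
            acc = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    y, x = sy + dy, sx + dx
                    if 0 <= y < h and 0 <= x < w:
                        acc += weights[(dy + r) * k + (dx + r)] * feature[y][x]
            out[oy][ox] = acc
    return out


feature = [[1.0, 2.0], [3.0, 4.0]]
# Logits sharply peaked at the kernel center (index 4 of 9)
peaked = [[[50.0 if i == 4 else 0.0 for i in range(9)] for _ in range(4)]
          for _ in range(4)]
for row in carafe_upsample(feature, peaked):
    print([round(v, 6) for v in row])
```

When every kernel is peaked at its center, the result is effectively nearest-neighbor upsampling; content-dependent kernels let each output position weight its neighborhood differently, which is the property the abstract attributes to CARAFE.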

CLC Number: TP391.4

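HU Shengkang, LIANG Guanghua, YANG Zhen, YAO Chen, LI Min. Detection algorithm for tennis ball and curtain object in tennis scene[J]. Journal of Computer Applications, DOI: 10.11772/j.issn.1001-9081.2025020194.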
