Action quality assessment model based on new trajectory-guided perceptual learning with X3D

doi:10.11772/j.issn.1001-9081.2025020158

Abstract

Action quality assessment (AQA) is a challenging visual task that has attracted considerable research attention. Existing methods mainly focus on improving the feature extraction capability of the backbone network and ignore the effect of motion trajectories, even though the coherence of a movement is an important factor in judging how well it is executed in real competitions. This paper introduces trajectory information and proposes a novel trajectory-guided perceptual learning framework for action quality assessment, in which trajectory descriptors guide the model to perceptually learn action coherence information and enable interactive learning between the different kinds of information. Because current datasets lack trajectory labels, an unsupervised optical flow trajectory extraction method based on the Farneback optical flow method is designed to obtain motion trajectory information, and the resulting optical flow trajectory features serve as guiding cues that direct the model's perceptual learning of video features. Finally, the learnable spline curves of a KAN (Kolmogorov-Arnold Network) are used to fit the data distribution of the mixed features, establishing a more accurate mapping relationship. Experiments on the MTL-AQA, AQA-7, FineDiving, and JIGSAWS datasets yield Spearman rank correlation coefficients of 0.9101, 0.912, 0.882, and 0.98, respectively.

Key words: trajectory, attention mechanism, guided perception learning, consistency of action, action quality assessment
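The abstract reports results as Spearman rank correlation coefficients between predicted and judged scores. The following is a minimal sketch of how such a coefficient can be computed, not the authors' evaluation code: the function names `average_ranks`, `pearson`, and `spearman_rho`, and the sample scores, are hypothetical and chosen only for illustration. It assumes tied scores receive the average of the ranks they span, which is the usual convention.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (norm_x * norm_y)


def spearman_rho(predicted, judged):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(predicted) != len(judged) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    return pearson(average_ranks(predicted), average_ranks(judged))


# Hypothetical scores for five clips (illustrative values, not from the paper).
predicted_scores = [82.1, 65.4, 90.3, 71.0, 55.2]
judged_scores = [80.0, 70.5, 88.0, 68.0, 50.0]
print(round(spearman_rho(predicted_scores, judged_scores), 4))  # 0.9
```

Because the coefficient depends only on the ordering of the scores, a model can reach a high value even when its predicted scores are offset or rescaled relative to the judges' scores.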

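Abstract (Chinese version, translated): Action quality assessment is a highly challenging visual task that has drawn the attention of many researchers. Current methods mainly concentrate on improving the feature extraction capability of the backbone network and overlook the influence of motion trajectories, although in practice the coherence of an action is also an important factor in evaluating how it is executed. By introducing trajectory information, this paper designs a new trajectory-guided perceptual learning model for action quality assessment that uses trajectory descriptors to guide the model to perceptually learn action coherence information and achieves interactive learning between different kinds of information. To address the lack of trajectory labels in current datasets, an unsupervised optical flow trajectory extraction method based on the Farneback optical flow method is designed to obtain motion trajectory information, and the extracted optical flow trajectory features are used as guide words to direct the model's perceptual learning of video features. Finally, the learnable spline curves of a KAN network are used to fit the data distribution of the mixed features and establish a more accurate mapping relationship. The model is evaluated on the MTL-AQA, AQA-7, FineDiving, and JIGSAWS datasets with the Spearman rank correlation coefficient as the metric, achieving 0.9101, 0.9120, 0.882, and 0.99 on the respective datasets.

Key words (Chinese version, translated): trajectory, attention mechanism, guided perceptual learning, action coherence, action quality assessment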

CLC Number: TP391.41

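Cite this article: 张四中, 刘建阳, 李林峰. Action quality assessment model based on new trajectory-guided perceptual learning with X3D [J]. Journal of Computer Applications, DOI: 10.11772/j.issn.1001-9081.2025020158.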

