Journal of Computer Applications, 2026, Vol. 46, Issue (2): 555-563. DOI: 10.11772/j.issn.1001-9081.2025020158

• Multimedia computing and computer simulation •

Action quality assessment model based on trajectory-guided perceptual learning with X3D

Sizhong ZHANG, Jianyang LIU, Linfeng LI

  1. School of Mechanical Engineering, Southwest Jiaotong University, Chengdu, Sichuan 610031, China
  • Received: 2025-02-21 Revised: 2025-04-15 Accepted: 2025-04-17 Online: 2025-04-24 Published: 2026-02-10
  • Contact: Jianyang LIU
  • About author: ZHANG Sizhong, born in 2000, M.S. candidate. His research interests include computer-vision-based action quality assessment.
    LIU Jianyang, born in 1985, Ph.D., lecturer. His research interests include computer vision and intelligent robots. Email: manchest@swjtu.edu.cn
    LI Linfeng, born in 1998, M.S. candidate. His research interests include computer-vision-based lesion detection.
  • Supported by:
    Project of Chengdu Municipal Science and Technology Bureau (2022-YF05-00379-SN)

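Chinese title: 基于X3D的轨迹引导感知学习的动作质量评估模型

张四中 (Sizhong ZHANG), 刘建阳 (Jianyang LIU), 李林峰 (Linfeng LI)

  1. School of Mechanical Engineering, Southwest Jiaotong University, Chengdu 610031
  • Corresponding author: LIU Jianyang
  • About the authors: ZHANG Sizhong, born in 2000, male, from Zhoukou, Henan, M.S. candidate, CCF member. Main research interest: computer-vision-based action quality assessment.
    LIU Jianyang, born in 1985, male, from Hengyang, Hunan, lecturer, Ph.D., CCF member. Main research interests: computer vision and intelligent robots. Email: manchest@swjtu.edu.cn
    LI Linfeng, born in 1998, male, from Ziyang, Sichuan, M.S. candidate, CCF member. Main research interest: computer-vision-based lesion detection.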

Abstract:

Action Quality Assessment (AQA) is a challenging visual task that has attracted considerable research attention. Current methods focus mainly on improving the feature extraction capability of backbone networks and ignore the influence of motion trajectories, even though the consistency of movements is an important factor in judging how well an action is executed in the real world. To address this, an AQA model based on trajectory-guided perceptual learning was proposed. Firstly, trajectory information was introduced so that trajectory descriptors guided the model to perceptually learn movement-consistency information, enabling interactive learning between different types of information. Secondly, because current datasets lack trajectory labels, an unsupervised optical flow trajectory extraction method based on the Farneback optical flow method was designed to obtain motion trajectory information, and the resulting optical flow trajectory features were used as cue words to guide the model in perceptually learning video features. Finally, the learnable spline curves of a Kolmogorov-Arnold Network (KAN) were used to fit the data distribution of the mixed features and thereby establish a more accurate mapping relationship. The proposed model was evaluated on the MTL-AQA, AQA-7, FineDiving, and JIGSAWS datasets with Spearman rank Correlation (Sp.Corr) as the evaluation metric. The results show that it achieves Sp.Corr values of 0.9101, 0.9120, 0.8820, and 0.9900, respectively, which are 0.4%, 12.6%, 6.2%, and 57.1% higher than those of the Uncertainty-aware Score Distribution Learning (USDL) model.
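
For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how a Spearman rank correlation between predicted scores and judge scores can be computed. It is not the authors' evaluation code: the function names `average_ranks` and `spearman_corr` and the sample scores are assumptions made for illustration, and ties are handled by average ranking, which the abstract does not specify.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over every entry tied with order[start].
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_corr(predicted: Sequence[float], judged: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    if len(predicted) != len(judged) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rp, rj = average_ranks(predicted), average_ranks(judged)
    mean_p = sum(rp) / len(rp)
    mean_j = sum(rj) / len(rj)
    cov = sum((a - mean_p) * (b - mean_j) for a, b in zip(rp, rj))
    var_p = sum((a - mean_p) ** 2 for a in rp)
    var_j = sum((b - mean_j) ** 2 for b in rj)
    if var_p == 0 or var_j == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_p * var_j) ** 0.5


# Hypothetical scores, invented for illustration only (not from the paper).
predicted_scores = [71.5, 88.0, 64.2, 93.1, 79.9]
judge_scores = [70.0, 90.5, 60.0, 85.0, 80.0]
print(round(spearman_corr(predicted_scores, judge_scores), 4))
```

Because the metric depends only on rank order, a model can score highly even when its predicted values are offset or rescaled relative to the judges' scores, as long as it orders the performances the same way.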

Key words: trajectory, attention mechanism, guided perceptual learning, consistency of movements, Action Quality Assessment (AQA)


CLC Number: