Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (9): 3036-3044. DOI: 10.11772/j.issn.1001-9081.2024091304
• Frontier and comprehensive applications •
Chao SHI¹, Yuxin ZHOU¹, Qian FU¹, Wanyu TANG¹, Ling HE¹, Yuanyuan LI²
Received: 2024-09-14
Revised: 2025-01-15
Accepted: 2025-01-24
Online: 2025-03-24
Published: 2025-09-10
Contact: Yuanyuan LI
About author: SHI Chao, born in 1997, from Tongren, Guizhou, M. S. candidate. His research interests include image processing.
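Abstract:
Attention Deficit Hyperactivity Disorder (ADHD) is a neurodevelopmental disorder common in childhood. It is characterized mainly by inattention, hyperactivity, and impulsivity, and patients often exhibit specific movement patterns. Traditional action recognition algorithms suffer from low recognition accuracy and slow response when handling these specific actions. To address these problems, an action recognition algorithm for ADHD patients based on skeleton and 3D heatmap was proposed, in which Gaussian distributions are used to represent the spatial relationships among joints accurately, so that spatio-temporal information is effectively preserved. To overcome the limitations of single-modality data, a multimodal integration method based on skeleton and 3D heatmap was introduced. In addition, the output features of Short 3D-CNN (3D Convolutional Neural Network) and the Adaptive Graph Convolutional Network (AGCN) were fused to exploit the advantages of both modalities and thereby improve action recognition performance. Experimental results on an ADHD patient dataset collected at the Mental Health Center of West China Hospital, Sichuan University show that, for 8 different types of actions, the proposed algorithm achieves a Top-1 recognition accuracy of 0.860 4 and a Top-5 recognition accuracy of 0.987 3. Furthermore, an automatic ADHD subtyping algorithm based on action types was proposed. It classifies ADHD into a head-and-face movement type, a trunk movement type, and a limb movement type, with a recognition accuracy of 75% and a response time of 5 s. Compared with 2s-AGCN (two-stream AGCN) and PoseConv3D, the proposed algorithm has higher recognition accuracy in complex action scenarios, providing a new technical means for personalized ADHD intervention.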
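The Gaussian joint representation mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common PoseConv3D-style construction, in which each 2D keypoint becomes a Gaussian blob scaled by its detection confidence and the per-frame maps are stacked over time. The function names, the `sigma` value, and the nested-list layout are illustrative assumptions, not details taken from the paper.

```python
import math


def joint_heatmap(cx, cy, height, width, sigma=1.0, confidence=1.0):
    """Return a height x width Gaussian heatmap centred on keypoint (cx, cy)."""
    return [
        [
            confidence
            * math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
            for x in range(width)
        ]
        for y in range(height)
    ]


def heatmap_volume(skeleton_seq, height, width, sigma=1.0):
    """Stack per-joint heatmaps over time.

    skeleton_seq: list of T frames, each a list of K (x, y, confidence) tuples.
    Returns a nested list indexed as volume[k][t][y][x].
    """
    num_joints = len(skeleton_seq[0])
    volume = []
    for k in range(num_joints):
        per_joint = []
        for frame in skeleton_seq:
            x, y, conf = frame[k]
            per_joint.append(joint_heatmap(x, y, height, width, sigma, conf))
        volume.append(per_joint)
    return volume


# Toy input (assumed): T=2 frames, K=2 joints, on a 4 x 5 grid.
toy_sequence = [
    [(2.0, 1.0, 1.0), (0.0, 3.0, 0.5)],
    [(3.0, 1.0, 1.0), (0.0, 2.0, 0.5)],
]
toy_volume = heatmap_volume(toy_sequence, height=4, width=5)
```

Each joint's map peaks at the keypoint location with a value equal to its confidence and decays smoothly around it, which is what lets a 3D-CNN consume skeleton data as a dense K×T×H×W volume.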
Chao SHI, Yuxin ZHOU, Qian FU, Wanyu TANG, Ling HE, Yuanyuan LI. Action recognition algorithm for ADHD patients using skeleton and 3D heatmap[J]. Journal of Computer Applications, 2025, 45(9): 3036-3044.
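Module/step | Input dimension | Output dimension | Description |
---|---|---|---|
Video input | | | T=60 is the number of frames; H=W=224 and C=3 are the frame height, width, and number of channels |
2D skeleton extraction | | | P=17/25/133 is the number of keypoints |
3D heatmap construction | | | K=17/25/133, i.e., one Gaussian heatmap per joint |
3D-CNN feature extraction | | | Extracts spatio-temporal features from the video; D1=256 is the feature dimension |
GCN feature extraction | | | Extracts spatio-temporal features from the skeleton sequence; D2=256 is the feature dimension |
MLFF-1 feature fusion | | | Weighted fusion of the 3D-CNN and GCN features, where D3=256 |
MLFF-2 feature fusion | | | Concatenation of the 3D-CNN and GCN features, where D4=256 |
MLFF-3 feature fusion | | | Features are concatenated and further fused with a Transformer; D5=256 |
Classification output | | 8 | Outputs the ADHD action category |
Tab. 1 Input and output dimension changes for each module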
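The first two fusion strategies in Tab. 1 (weighted fusion and concatenation) can be sketched on plain feature vectors as follows. This is only an illustration under stated assumptions: the normalisation by the weight sum, the function names, and the argument order are hypothetical, and the sketch omits any projection layer that may bring the concatenated feature back to 256 dimensions. Which modality the first number of a ratio such as 2:1 refers to is not stated in the table.

```python
def weighted_fusion(feat_cnn, feat_gcn, w_cnn=1.0, w_gcn=1.0):
    """Normalised weighted sum of two equal-length feature vectors."""
    if len(feat_cnn) != len(feat_gcn):
        raise ValueError("feature vectors must have the same length")
    total = w_cnn + w_gcn
    return [(w_cnn * a + w_gcn * b) / total for a, b in zip(feat_cnn, feat_gcn)]


def concat_fusion(feat_cnn, feat_gcn):
    """Concatenate the two feature vectors end to end."""
    return list(feat_cnn) + list(feat_gcn)


# Toy 2-dimensional features standing in for the 256-dimensional ones.
cnn_feature = [3.0, 0.0]
gcn_feature = [0.0, 3.0]
fused_equal = weighted_fusion(cnn_feature, gcn_feature)            # 1:1 weighting
fused_biased = weighted_fusion(cnn_feature, gcn_feature, 2.0, 1.0)  # 2:1 weighting
fused_concat = concat_fusion(cnn_feature, gcn_feature)
```

Weighted fusion keeps the feature dimension unchanged, whereas concatenation doubles it and leaves the weighting of the two modalities to whatever layer follows.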
MLFF | Top-1 accuracy | Top-5 accuracy |
---|---|---|
MLFF-1 (1:1) | 0.856 0 | 0.987 9 |
MLFF-1 (2:1) | 0.859 8 | 0.986 7 |
MLFF-1 (1:2) | 0.838 8 | 0.987 8 |
MLFF-2 | 0.854 1 | 0.987 9 |
MLFF-3 | 0.860 4 | 0.987 3 |
Tab. 2 Action type recognition accuracy of 3D-GCN for ADHD patients under different fusion strategies
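For readers unfamiliar with the two metrics reported in Tab. 2, the following sketch shows how Top-k accuracy is conventionally computed from per-class scores: a sample counts as correct if its true label is among the k highest-scoring classes. The function name and the toy scores are illustrative assumptions, not data from the paper.

```python
def top_k_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest scores."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy example (assumed): 3 samples, 3 classes.
toy_scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
]
toy_labels = [1, 1, 0]
top1 = top_k_accuracy(toy_scores, toy_labels, k=1)
top2 = top_k_accuracy(toy_scores, toy_labels, k=2)
```

With 8 action classes, Top-5 is a much looser criterion than Top-1, which is consistent with the Top-5 values in the table all lying close to 0.99 while the Top-1 values separate the strategies.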
Algorithm | Top-1 accuracy | Top-5 accuracy | Parameters/10⁶ |
---|---|---|---|
ST-GCN | 0.811 5 | 0.994 8 | 3.10 |
MS-G3D | 0.842 8 | 0.989 2 | 14.28 |
CTR-GCN | 0.842 6 | 0.986 7 | 1.95 |
AGCN | 0.832 5 | 0.980 3 | 2.80 |
ST-GCN++ | 0.814 7 | 0.984 8 | 1.40 |
2s-AGCN | 0.843 1 | 0.991 8 | 3.50 |
PoseConv3D | 0.847 1 | 0.991 1 | 2.00 |
3D-GCN | 0.860 4 | 0.987 3 | 2.46 |
Tab. 3 Action type recognition performance for ADHD patients of different deep learning algorithms
Algorithm | Top-1 accuracy | Top-5 accuracy | Parameters/10⁶ |
---|---|---|---|
ST-GCN | 0.889 5 | 0.987 8 | 3.10 |
CTR-GCN | 0.896 0 | 0.989 3 | 1.95 |
MS-G3D | 0.913 0 | 0.993 8 | 14.28 |
ST-GCN++ | 0.892 6 | 0.984 8 | 1.39 |
AGCN | 0.886 0 | 0.985 1 | 3.50 |
2s-AGCN | 0.919 5 | 0.992 6 | 3.50 |
PoseConv3D | 0.934 7 | 0.995 4 | 2.00 |
3D-GCN | 0.942 4 | 0.989 1 | 2.46 |
Tab. 4 Comparison of action recognition performance of different deep learning algorithms on NTU RGB+D 60 dataset
[1] | ZHENG Y, LI R, LI S, et al. A review on serious games for ADHD [EB/OL]. [2024-08-10]. |
[2] | ZHAO J X, WU Z Q, WANG X F, et al. Risk prediction of attention deficit hyperactivity disorder based on machine learning [J]. Chinese Pediatrics of Integrated Traditional and Western Medicine, 2024, 16(2): 130-136. (in Chinese) |
[3] | BU X O, WANG Y, DU Y W, et al. Application of machine learning in early screening of children with dyslexia [J]. Advances in Psychological Science, 2023, 31(11): 2092-2105. (in Chinese) |
[4] | LUO J, HE F, ZHENG Y. Video games for the assessment and treatment of attention deficit hyperactivity disorder in children: a systematic review [J]. Journal of Developmental Medicine (Electronic Version), 2023, 11(6): 401-410. (in Chinese) |
[5] | XIANG W. A review on new progress in ADHD diagnosis and treatment [J]. Frontiers of Clinical Medicine, 2023, 5(6): 138-140. (in Chinese) |
[6] | JAISWAL S, VALSTAR M F, GILLOTT A, et al. Automatic detection of ADHD and ASD from expressive behaviour in RGBD data [C]// Proceedings of the 12th IEEE International Conference on Automatic Face and Gesture Recognition. Piscataway: IEEE, 2017: 762-769. |
[7] | DENG S, PRASSE P, REICH D R, et al. Detection of ADHD based on eye movements during natural viewing [C]// Proceedings of the 2022 European Conference on Machine Learning and Knowledge Discovery in Databases, LNCS 13718. Cham: Springer, 2023: 403-418. |
[8] | OUYANG C S, CHIU Y H, CHIANG C T, et al. Evaluating therapeutic effects of ADHD medication objectively by movement quantification with a video-based skeleton analysis [J]. International Journal of Environmental Research and Public Health, 2021, 18(17): No.9363. |
[9] | DEY S, RAO A R, SHAH M. Exploiting the brain’s network structure for automatic identification of ADHD subjects [EB/OL]. [2024-05-12]. |
[10] | SIMS C. Highly accurate FMRI ADHD classification using time distributed multi modal 3D-CNNs [EB/OL]. [2024-02-13]. |
[11] | ULUYAGMUR-OZTURK M, ARMAN A R, YILMAZ S S, et al. ADHD and ASD classification based on emotion recognition data[C]// Proceedings of the 15th IEEE International Conference on Machine Learning and Applications. Piscataway: IEEE, 2016: 810-813. |
[12] | AMADO-CABALLERO P, CASASECA-DE-LA-HIGUERA P, ALBEROLA-LOPEZ S, et al. Objective ADHD diagnosis using convolutional neural networks over daily-life activity records [J]. IEEE Journal of Biomedical and Health Informatics, 2020, 24(9): 2690-2700. |
[13] | ZHANG-JAMES Y, RAZAVI A S, HOOGMAN M, et al. Machine learning and MRI-based diagnostic models for ADHD: are we there yet? [J]. Journal of Attention Disorders, 2023, 27(4): 335-353. |
[14] | ALCHALABI A E, SHIRMOHAMMADI S, EDDIN A N, et al. FOCUS: detecting ADHD patients by an EEG-based serious game[J]. IEEE Transactions on Instrumentation and Measurement, 2018, 67(7): 1512-1520. |
[15] | PENG J, DEBNATH M, BISWAS A K. Efficacy of novel summation-based synergetic artificial neural network in ADHD diagnosis [J]. Machine Learning with Applications, 2021, 6: No.100120. |
[16] | OCHAB J K, GERC K, FAFROWICZ M, et al. Classifying attention deficit hyperactivity disorder in children with non-linearities in actigraphy [EB/OL]. [2024-05-03]. |
[17] | CHOI M T, YEOM J, SHIN Y, et al. Robot-assisted ADHD screening in diagnostic process [J]. Journal of Intelligent and Robotic Systems, 2019, 95(2): 351-363. |
[18] | ANDRIKOPOULOS D, VASSILIOU G, FATOUROS P, et al. Machine learning-enabled detection of attention-deficit/hyperactivity disorder with multimodal physiological data: a case-control study [J]. BMC Psychiatry, 2024, 24: No.547. |
[19] | ALSHARIF N, AL-ADHAILEH M H, ALSUBARI S N, et al. ADHD diagnosis using text features and predictive machine learning and deep learning algorithms [J]. Journal of Disability Research, 2024, 3(7): No.0082. |
[20] | LIU J, SHAHROUDY A, XU D, et al. Skeleton-based action recognition using spatio-temporal LSTM network with trust gates[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(12): 3007-3021. |
[21] | LYU A W. Research on related problems of ADHD brain-computer interface aided diagnosis based on machine learning [D]. Anshan: University of Science and Technology Liaoning, 2022. (in Chinese) |
[22] | CHEN Y, ZHANG Z, YUAN C, et al. Channel-wise topology refinement graph convolution for skeleton-based action recognition[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 13339-13348. |
[23] | LIU Z, ZHANG H, CHEN Z, et al. Disentangling and unifying graph convolutions for skeleton-based action recognition [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 140-149. |
[24] | DUAN H, WANG J, CHEN K, et al. PYSKL: towards good practices for skeleton action recognition [C]// Proceedings of the 30th ACM International Conference on Multimedia. New York: ACM, 2022: 7351-7354. |
[25] | LI W, LIU M, LIU H, et al. GraphMLP: a graph MLP-like architecture for 3D human pose estimation [J]. Pattern Recognition, 2025, 158: No.110925. |
[26] | DUAN H, ZHAO Y, CHEN K, et al. Revisiting skeleton-based action recognition [C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 2959-2968. |
[27] | YAN S, XIONG Y, LIN D. Spatial temporal graph convolutional networks for skeleton-based action recognition [C]// Proceedings of the 32nd AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2018: 7444-7452. |
[28] | SHI L, ZHANG Y, CHENG J, et al. Two-stream adaptive graph convolutional networks for skeleton-based action recognition [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 12018-12027. |