Journal of Computer Applications, 2023, Vol. 43, Issue (2): 529-535. DOI: 10.11772/j.issn.1001-9081.2022010114

• Multimedia computing and computer simulation •

基于场景先验及注意力引导的跌倒检测算法

王萍1,2, 陈楠1, 鲁磊1,2

  1. 西安交通大学 信息与通信工程学院, 西安 710049
  2. 综合业务网理论及关键技术国家重点实验室(西安电子科技大学), 西安 710071
  • 收稿日期:2022-01-28 修回日期:2022-04-26 接受日期:2022-04-27 发布日期:2022-05-16 出版日期:2023-02-10
  • 通讯作者: 王萍
  • 作者简介:陈楠(1997—),女,陕西榆林人,硕士研究生,主要研究方向:深度学习、目标检测与识别
    鲁磊(1988—),男,陕西西安人,讲师,博士,CCF会员,主要研究方向:图像处理、深度学习、信号分析。

Fall detection algorithm based on scene prior and attention guidance

Ping WANG1,2, Nan CHEN1, Lei LU1,2

  1. School of Information and Communication Engineering, Xi’an Jiaotong University, Xi’an Shaanxi 710049, China
  2. State Key Laboratory of Integrated Services Networks (Xidian University), Xi’an Shaanxi 710071, China
  • Received: 2022-01-28 Revised: 2022-04-26 Accepted: 2022-04-27 Online: 2022-05-16 Published: 2023-02-10
  • Contact: Ping WANG
  • About author: CHEN Nan, born in 1997, M.S. candidate. Her research interests include deep learning, object detection and recognition.
    LU Lei, born in 1988, Ph.D., lecturer. His research interests include image processing, deep learning, signal analysis.

摘要:

已有跌倒检测工作主要关注室内场景,且大多偏重对人员身体姿态特征进行建模,而忽略了场景背景信息以及人员与地面的交互信息。针对这个问题,从实际电梯场景应用入手,提出一种基于场景先验及注意力引导的跌倒检测算法。首先,利用电梯历史数据,以高斯概率分布建模的方式从人员的活动轨迹中自动化地学习场景先验信息;随后,把场景先验信息作为空间注意力掩膜与神经网络的全局特征融合,以此聚焦地面区域的局部信息;然后,将融合后的局部特征与全局特征采用自适应加权的方式进一步聚合,从而形成更具鲁棒性和判别力的特征;最后,将特征送入由全局平均池化层和全连接层构成的分类模块中进行跌倒类别预测。在自构建的电梯场景Elevator Fall Detection和公开的UR Fall Detection数据集上的实验结果表明,所提算法的检测准确率分别达到了95.36%和99.01%,相较于网络结构复杂的ResNet50算法,分别提高了3.52个百分点和0.61个百分点。可见所构建的高斯场景先验引导的注意力机制可使网络关注地面区域的特征,更有利于对跌倒的识别,由此得到的检测模型准确率高且算法满足实时性应用要求。

关键词: 跌倒检测, 注意力机制, 高斯先验, 特征融合, 卷积神经网络, 深度学习

Abstract:

Existing fall detection work mainly focuses on indoor scenes, and most of it models only people’s body posture features, ignoring the background information of the scene and the interaction between people and the ground. To address this problem, a fall detection algorithm based on scene prior and attention guidance was proposed for practical elevator scenes. Firstly, historical elevator data were used to learn the scene prior information automatically from people’s trajectories by Gaussian probability distribution modelling. Then, the scene prior information was used as a spatial attention mask and fused with the global features of the neural network to focus on local information of the ground area. After that, the fused local features and the global features were further aggregated by adaptive weighting to form more robust and discriminative features. Finally, the features were fed into a classification module consisting of a global average pooling layer and a fully connected layer to predict the fall category. Experimental results on the self-built elevator scene dataset Elevator Fall Detection and the public UR Fall Detection dataset show that the detection accuracy of the proposed algorithm reached 95.36% and 99.01% respectively, which is 3.52 and 0.61 percentage points higher, respectively, than that of ResNet50, which has a complicated network structure. These results indicate that the proposed attention mechanism guided by the Gaussian scene prior makes the network focus on features of the ground area, which is more conducive to detecting falls. The resulting detection model has high accuracy, and the algorithm meets the requirements of real-time applications.

Key words: fall detection, attention mechanism, Gaussian prior, feature fusion, Convolutional Neural Network (CNN), deep learning
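
The pipeline summarised in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the fusion operator, how the adaptive weights are produced, or the backbone network. The sketch assumes that the prior is a single 2-D Gaussian fitted to trajectory points, that fusion is element-wise multiplication of the feature map by the mask, and that the two branches are combined with softmax-normalised scalar weights. All function and parameter names (`fit_gaussian_prior`, `prior_mask`, `fuse_and_classify`, `branch_logits`, `eps`) are hypothetical.

```python
import math

def fit_gaussian_prior(points, eps=1e-6):
    """Fit a 2-D Gaussian (mean, covariance) to trajectory points (x, y)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # eps keeps the covariance invertible for degenerate (collinear) data
    sxx = sum((p[0] - mx) ** 2 for p in points) / n + eps
    syy = sum((p[1] - my) ** 2 for p in points) / n + eps
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return (mx, my), (sxx, sxy, syy)

def prior_mask(mean, cov, height, width):
    """Evaluate the unnormalised Gaussian on an H x W grid (value 1 at the mean)."""
    (mx, my), (sxx, sxy, syy) = mean, cov
    det = sxx * syy - sxy * sxy
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            dx, dy = x - mx, y - my
            # squared Mahalanobis distance via the closed-form 2x2 inverse
            m2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
            row.append(math.exp(-0.5 * m2))
        mask.append(row)
    return mask

def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_and_classify(features, mask, branch_logits, fc_weights, fc_bias):
    """features: C x H x W nested lists; mask: H x W; returns class probabilities."""
    # Assumption: two scalar branch weights, learned in practice, fixed here
    w_global, w_local = softmax(branch_logits)
    area = len(mask) * len(mask[0])
    pooled = []
    for channel in features:
        total = 0.0
        for f_row, m_row in zip(channel, mask):
            for f, m in zip(f_row, m_row):
                local = f * m  # the mask emphasises the ground area
                total += w_global * f + w_local * local
        pooled.append(total / area)  # global average pooling per channel
    # fully connected layer followed by softmax
    logits = [sum(w * p for w, p in zip(row, pooled)) + b
              for row, b in zip(fc_weights, fc_bias)]
    return softmax(logits)
```

In this sketch the trajectory points would be, for example, foot positions collected from historical footage, and `features` would be the output of a convolutional backbone resized to the same spatial grid as the mask.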

CLC Number: