Pedestrian fall detection algorithm in complex scenes

Journal of Computer Applications, 2023, Vol. 43, Issue 6: 1811-1817. DOI: 10.11772/j.issn.1001-9081.2022050754

Special Issue: Artificial intelligence

Ke FANG, Rong LIU, Chiyu WEI, Xinyue ZHANG, Yang LIU

College of Physical Science and Technology, Central China Normal University, Wuhan Hubei 430079, China

Received: 2022-05-27; Revised: 2022-09-19; Accepted: 2022-10-08; Online: 2023-06-08; Published: 2023-06-10
Contact: Rong LIU
About the authors: FANG Ke, born in 1999, M. S. candidate. His research interests include deep learning and object detection.
WEI Chiyu, born in 1998, M. S. candidate. His research interests include deep learning and object detection.
ZHANG Xinyue, born in 1997, M. S. candidate. Her research interests include intelligent information processing and sentiment identification.
LIU Yang, born in 1999, M. S. candidate. His research interests include intelligent information processing and deep learning.
Supported by:
National Social Science Foundation of China (19BTQ005)

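Chinese title: 复杂场景下的行人跌倒检测算法 (Pedestrian fall detection algorithm in complex scenes)

方可 (Ke FANG), 刘蓉 (Rong LIU), 魏驰宇 (Chiyu WEI), 张心月 (Xinyue ZHANG), 刘杨 (Yang LIU)

College of Physical Science and Technology, Central China Normal University, Wuhan 430079

Corresponding author: Rong LIU
About the authors: FANG Ke (born 1999), male, from Zhoukou, Henan, M. S. candidate. Main research interests: deep learning, object detection.
LIU Rong (born 1969), female, from Anhua, Hunan, associate professor, Ph. D. Main research interests: intelligent information processing, artificial intelligence. Email: liurong@ccnu.edu.cn
WEI Chiyu (born 1998), male, from Zhoukou, Henan, M. S. candidate. Main research interests: deep learning, object detection.
ZHANG Xinyue (born 1998), female, from Zhoukou, Henan, M. S. candidate. Main research interests: intelligent information processing, sentiment recognition.
LIU Yang (born 1999), male, from Changsha, Hunan, M. S. candidate. Main research interests: intelligent information processing, deep learning.
Funding:
National Social Science Foundation of China (19BTQ005)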

Abstract:

With the deepening of population aging, fall detection has become a key issue in the medical and health field. To address the low accuracy of fall detection algorithms in complex scenes, an improved fall detection model, PDD-FCOS (PVT DRFPN DIoU-Fully Convolutional One-Stage object detection), was proposed. Pyramid Vision Transformer (PVT) was introduced into the backbone network of the baseline FCOS algorithm to extract richer semantic information without increasing the amount of computation. In the feature fusion stage, Double Refinement Feature Pyramid Networks (DRFPN) were inserted to learn the positions and other information of sampling points between feature maps more accurately, and context information was used to capture more accurate semantic relationships between feature channels, improving detection performance. In the training stage, bounding box regression was carried out with the Distance Intersection over Union (DIoU) loss: by optimizing the distance between the center points of the prediction box and the object box, the regression box was made to converge faster and more accurately, which effectively improved the accuracy of the fall detection algorithm. Experimental results show that on the open-source dataset Fall detection Database, the mean Average Precision (mAP) of the proposed model reaches 82.2%, which is 6.4 percentage points higher than that of the baseline FCOS algorithm, and that the proposed algorithm is more accurate and generalizes better than the other mainstream object detection algorithms it was compared with.

Key words: object detection, pedestrian fall detection, Pyramid Vision Transformer (PVT), attention mechanism, Double Refinement Feature Pyramid Networks (DRFPN), Distance Intersection over Union (DIoU)
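
The DIoU loss named in the abstract is commonly written, following Zheng et al. [23], as 1 - IoU + d^2 / c^2, where d is the distance between the two box centers and c is the diagonal of the smallest box enclosing both. The sketch below is a minimal pure-Python illustration of that formula, not the authors' training code; the function name `diou_loss` and the (x1, y1, x2, y2) corner convention are assumptions made for this example, and both boxes are assumed to have positive area.

```python
def diou_loss(pred, target):
    """Distance-IoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Illustrative sketch: returns 1 - IoU + d^2 / c^2, where d is the distance
    between box centers and c is the diagonal of the smallest enclosing box.
    Assumes both boxes have positive width and height.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h

    # Union area and IoU.
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / union

    # Squared distance between the two box centers.
    center_dist_sq = ((px1 + px2) / 2 - (tx1 + tx2) / 2) ** 2 \
                   + ((py1 + py2) / 2 - (ty1 + ty2) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both boxes.
    enclose_w = max(px2, tx2) - min(px1, tx1)
    enclose_h = max(py2, ty2) - min(py1, ty1)
    diag_sq = enclose_w ** 2 + enclose_h ** 2

    return 1.0 - iou + center_dist_sq / diag_sq


if __name__ == "__main__":
    box = (0.0, 0.0, 1.0, 1.0)
    print(diou_loss(box, box))                    # identical boxes
    print(diou_loss(box, (2.0, 0.0, 3.0, 1.0)))   # disjoint, near
    print(diou_loss(box, (4.0, 0.0, 5.0, 1.0)))   # disjoint, farther
```

Unlike a plain IoU loss, which stays at 1 for every pair of non-overlapping boxes, the center-distance term keeps growing as the boxes move apart, so the loss still distinguishes a near miss from a far one. This is one way to read the abstract's statement that the regression box converges faster.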

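Abstract (Chinese version):

As population aging continues to deepen, fall detection has become a key issue in the medical and health field. To address the low accuracy of fall detection algorithms in complex scenes, an improved fall detection model, PDD-FCOS (PVT DRFPN DIoU-Fully Convolutional One-Stage object detection), is proposed. Pyramid Vision Transformer (PVT) is introduced into the backbone network of the baseline FCOS algorithm to extract richer semantic information without increasing the amount of computation. In the feature fusion stage, Double Refinement Feature Pyramid Networks (DRFPN) are inserted to learn the positions and other information of sampling points between feature maps more accurately, and context information is used to capture more accurate semantic relationships between feature channels, thereby improving detection performance. In the training stage, the Distance Intersection over Union (DIoU) loss is used for bounding box regression; by optimizing the distance between the center points of the prediction box and the object box, the regression box converges faster and more accurately, which effectively improves the accuracy of the fall detection algorithm. Experimental results show that the proposed model reaches a mean Average Precision (mAP) of 82.2% on the open-source dataset Fall detection Database, 6.4 percentage points higher than the baseline FCOS algorithm, and that it is more accurate and generalizes better than other mainstream object detection algorithms.

Keywords (Chinese version): object detection, pedestrian fall detection, Pyramid Vision Transformer, attention mechanism, Double Refinement Feature Pyramid Networks, Distance Intersection over Union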

CLC Number: TP391.4

Ke FANG, Rong LIU, Chiyu WEI, Xinyue ZHANG, Yang LIU. Pedestrian fall detection algorithm in complex scenes[J]. Journal of Computer Applications, 2023, 43(6): 1811-1817.

References

1	NING J Z. Main data of the seventh National Census[J]. China Statistics, 2021(5): 4-5. (in Chinese)
2	GAO M L, SONG Y T. Meta-analysis of the prevalence of fall in elderly in China[J]. Beijing Medical Journal, 2014, 36(10): 796-798. (in Chinese)
3	ZHANG Q L, ZHANG L. Advances in falls in the elderly[J]. Chinese Journal of Gerontology, 2016, 36(1): 248-249. DOI: 10.3969/j.issn.1005-9202.2016.01.112 (in Chinese)
4	MUBASHIR M, SHAO L, SEED L. A survey on fall detection: principles and approaches[J]. Neurocomputing, 2013, 100: 144-152. DOI: 10.1016/j.neucom.2011.09.037
5	REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[C]// Proceedings of the 28th International Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2015, 1: 91-99.
6	REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 779-788. DOI: 10.1109/cvpr.2016.91
7	LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot MultiBox detector[C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9905. Cham: Springer, 2016: 21-37.
8	ZHOU X Y, WANG D Q, KRÄHENBÜHL P. Objects as points[EB/OL]. (2019-04-25)[2022-04-07].
9	TIAN Z, SHEN C H, CHEN H, et al. FCOS: fully convolutional one-stage object detection[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 9626-9635. DOI: 10.1109/iccv.2019.00972
10	FENG Q, GAO C Q, WANG L, et al. Spatio-temporal fall event detection in complex scenes using attention guided LSTM[J]. Pattern Recognition Letters, 2020, 130: 242-249. DOI: 10.1016/j.patrec.2018.08.031
11	REDMON J, FARHADI A. YOLOv3: an incremental improvement[EB/OL]. (2018-04-08)[2022-04-07].
12	ZHU Y, ZHANG Y P, LI S S, et al. Fall detection algorithm based on depth vision sensor and neural network[J]. Optical Technique, 2021, 47(1): 56-61. (in Chinese)
13	CHEN Y, LI W T, WANG L, et al. Vision-based fall event detection in complex background using attention guided Bi-directional LSTM[J]. IEEE Access, 2020, 8: 161337-161348. DOI: 10.1109/access.2020.3021795
14	HE K M, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 2980-2988. DOI: 10.1109/iccv.2017.322
15	MA L, PEI W, ZHU Y Y, et al. Fall action recognition based on deep learning[J]. Computer Science, 2019, 46(9): 106-112. DOI: 10.11896/j.issn.1002-137X.2019.09.014 (in Chinese)
16	CAI X, LI S Y, LIU X Y, et al. Vision-based fall detection with multi-task hourglass convolutional auto-encoder[J]. IEEE Access, 2020, 8: 44493-44502. DOI: 10.1109/access.2020.2978249
17	CAO J R, LYU J J, WU X Y, et al. Fall detection algorithm integrating motion features and deep learning[J]. Journal of Computer Applications, 2021, 41(2): 583-589. DOI: 10.11772/j.issn.1001-9081.2020050705 (in Chinese)
18	GARCÍA E, VILLAR M, FÁÑEZ M, et al. Towards effective detection of elderly falls with CNN-LSTM neural networks[J]. Neurocomputing, 2022, 500: 231-240. DOI: 10.1016/j.neucom.2021.06.102
19	WANG B H, YU J, WANG K, et al. Fall detection based on dual-channel feature integration[J]. IEEE Access, 2020, 8: 103443-103453. DOI: 10.1109/access.2020.2999503
20	WANG W H, XIE E Z, LI X, et al. Pyramid vision Transformer: a versatile backbone for dense prediction without convolutions[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 548-558. DOI: 10.1109/iccv48922.2021.00061
21	HE K M, ZHANG X Y, REN S Q, et al. Identity mappings in deep residual networks[C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9908. Cham: Springer, 2016: 630-645.
22	MA J L, CHEN B. Dual refinement feature pyramid networks for object detection[EB/OL]. (2020-12-04)[2022-04-07].
23	ZHENG Z H, WANG P, LIU W, et al. Distance-IoU loss: faster and better learning for bounding box regression[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2020: 12993-13000. DOI: 10.1609/aaai.v34i07.6999
24	WANG W H, XIE E Z, LI X, et al. PVTv2: improved baselines with pyramid vision transformer[J]. Computational Visual Media, 2022, 8(3): 415-424. DOI: 10.1007/s41095-022-0274-8
25	VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2017: 6000-6010.

Experimental environment	Version
Operating system	Linux
GPU	Tesla P100-PCIE-16GB
CUDA	11.0
cuDNN	7.6.5
Deep learning framework	PyTorch 1.7.0
Language interpreter	Python 3.7.9

Algorithm	Backbone	mAP/%	Parameters/10⁷	Computation/10¹¹
Faster R-CNN [5]	ResNet50	81.3	4.114	2.0668
Mask R-CNN [14]	ResNet50	81.8	4.375	2.5814
YOLOv3 [11]	Darknet53	80.3	6.152	1.9385
FCOS [9]	ResNet50	75.8	3.189	2.0146
Proposed algorithm	PVT	82.2	3.755	1.9296
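
The abstract's headline figures can be checked directly against the comparison table above. The following sketch copies the table values into a dictionary and computes the mAP gain in percentage points and the relative change in parameters and computation with respect to a baseline. The key `PDD-FCOS` (for the proposed algorithm's row), the helper name `compare_to_baseline`, and the dictionary layout are assumptions made for this illustration.

```python
# (mAP in %, parameters in units of 10^7, computation in units of 10^11),
# copied from the comparison table.
RESULTS = {
    "Faster R-CNN": (81.3, 4.114, 2.0668),
    "Mask R-CNN": (81.8, 4.375, 2.5814),
    "YOLOv3": (80.3, 6.152, 1.9385),
    "FCOS": (75.8, 3.189, 2.0146),
    "PDD-FCOS": (82.2, 3.755, 1.9296),
}


def compare_to_baseline(results, model, baseline):
    """Return the mAP gain (percentage points) and relative cost changes (%)."""
    m_map, m_params, m_compute = results[model]
    b_map, b_params, b_compute = results[baseline]
    return {
        "map_gain_points": round(m_map - b_map, 1),
        "param_change_pct": round(100 * (m_params - b_params) / b_params, 1),
        "compute_change_pct": round(100 * (m_compute - b_compute) / b_compute, 1),
    }


if __name__ == "__main__":
    print(compare_to_baseline(RESULTS, "PDD-FCOS", "FCOS"))
    # Model with the highest mAP in the table.
    print(max(RESULTS, key=lambda name: RESULTS[name][0]))
```

With the table values, the gain over FCOS is 6.4 percentage points, matching the abstract. The computation column drops by about 4.2% while the parameter count rises by about 17.7%, so "without increasing the amount of computation" holds for computation but not for model size.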

Group No.	PVT	DRFPN	DIoU	mAP/%
1	×	×	×	75.8
2	√	×	×	79.8
3	√	×	√	81.5
4	√	√	√	82.2
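
Reading the ablation table row by row shows how much each component adds in the order it was enabled. The sketch below is only an arithmetic illustration of the table values; the names `ABLATION` and `incremental_gains` are assumptions made for this example. The per-step gains depend on the order in which components were added and do not measure each component in isolation.

```python
# Each row: (which components are enabled, mAP in %), copied from the ablation table.
ABLATION = [
    ({"PVT": False, "DRFPN": False, "DIoU": False}, 75.8),
    ({"PVT": True, "DRFPN": False, "DIoU": False}, 79.8),
    ({"PVT": True, "DRFPN": False, "DIoU": True}, 81.5),
    ({"PVT": True, "DRFPN": True, "DIoU": True}, 82.2),
]


def incremental_gains(rows):
    """For consecutive rows, list the newly enabled component(s) and the mAP gain."""
    gains = []
    for (prev_cfg, prev_map), (cfg, cur_map) in zip(rows, rows[1:]):
        added = [name for name in cfg if cfg[name] and not prev_cfg[name]]
        gains.append(("+".join(added), round(cur_map - prev_map, 1)))
    return gains


if __name__ == "__main__":
    steps = incremental_gains(ABLATION)
    for component, gain in steps:
        print(f"{component}: +{gain} points")
    print("total:", round(sum(gain for _, gain in steps), 1))
```

In this ordering, PVT contributes 4.0 points, DIoU a further 1.7, and DRFPN the final 0.7, which together account for the 6.4-point improvement over the baseline.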
