Journal of Computer Applications (official website) ›› 2025, Vol. 45 ›› Issue (4): 1271-1284. DOI: 10.11772/j.issn.1001-9081.2024040561

• Multimedia computing and computer simulation •

Moving pedestrian detection neural network with invariant global sparse contour point representation

Qingqing ZHAO1,2, Bin HU1,2,3

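  1. State Key Laboratory of Public Big Data (Guizhou University), Guiyang 550025
  2. College of Computer Science and Technology, Guizhou University, Guiyang 550025
  3. Artificial Intelligence Research Institute, Guizhou University, Guiyang 550025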
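  • Received: 2024-05-07 Revised: 2024-09-24 Accepted: 2024-09-26 Online: 2025-04-08 Published: 2025-04-10
  • Corresponding author: Bin HU
  • About the author: ZHAO Qingqing (born 1995), female, from Renhuai, Guizhou, M.S. candidate. Her main research interests are computational intelligence and computer vision.
  • Funding:
    National Natural Science Foundation of China (62066006)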

Moving pedestrian detection neural network with invariant global sparse contour point representation

Qingqing ZHAO1,2, Bin HU1,2,3

  1. State Key Laboratory of Public Big Data (Guizhou University), Guiyang Guizhou 550025, China
  2. College of Computer Science and Technology, Guizhou University, Guiyang Guizhou 550025, China
  3. Artificial Intelligence Research Institute, Guizhou University, Guiyang Guizhou 550025, China
  • Received: 2024-05-07 Revised: 2024-09-24 Accepted: 2024-09-26 Online: 2025-04-08 Published: 2025-04-10
  • Contact: Bin HU
  • About author: ZHAO Qingqing, born in 1995, M.S. candidate. Her research interests include computational intelligence and computer vision.
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (62066006)

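Abstract:

Pedestrians are non-rigid objects, so an effective invariant representation of their visual features is the key to better recognition. In natural visual scenes, moving pedestrians usually change in scale, background, and pose, which hinders existing techniques from extracting these irregular features. To address this problem, this work studies invariant recognition of moving pedestrians on the basis of the neural structure of the mammalian retina and proposes a Moving Pedestrian Detection Neural Network (MPDNN) suited to visual scenes. MPDNN comprises two neural modules: a presynaptic network and a postsynaptic network. The presynaptic network perceives the low-order visual motion cues that represent a moving object and extracts the object's binarized visual information. The postsynaptic network draws on the sparse invariant response property of the biological visual system and exploits the fact that the positional relationship between the larger concave and convex regions of an object's contour stays unchanged while the contour continuously changes shape; from the low-order motion cues it encodes smoothly varying visual features to build an invariant representation of pedestrians. Experimental results show that MPDNN reaches a cross-domain detection accuracy of 96.96% on the public datasets CUHK Avenue and EPFL, 4.52 percentage points higher than the state-of-the-art (SOTA) model, and that it is also robust on datasets with scale variation and motion-pose variation, with accuracies of 89.48% and 91.45%, respectively. These results verify the effectiveness of the biological invariant object recognition mechanism for moving pedestrian detection.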

Key words: object detection, moving pedestrian, invariant object recognition, visual motion perception, retinal nerve

Abstract:

Pedestrians are non-rigid objects, so an effective invariant representation of their visual features is key to improving recognition performance. In natural visual scenes, moving pedestrians often undergo changes in scale, background, and pose, which hinders existing techniques from extracting these irregular features. To address this issue, this work explores invariant recognition of moving pedestrians based on the neural structural characteristics of the mammalian retina and proposes a Moving Pedestrian Detection Neural Network (MPDNN) for visual scenes. MPDNN is composed of two neural modules: a presynaptic network and a postsynaptic network. The presynaptic network perceives the low-level visual motion cues that represent the moving object and extracts the object's binarized visual information. The postsynaptic network takes advantage of the sparse invariant response properties of the biological visual system and uses the fact that the positional relationship between the larger concave and convex regions of the object's contour remains invariant under continuous shape changes; it then encodes smoothly varying visual features from the low-level motion cues to build an invariant representation of pedestrians. Experimental results show that MPDNN achieves a cross-domain detection accuracy of 96.96% on the public datasets CUHK Avenue and EPFL, which is 4.52 percentage points higher than the state-of-the-art (SOTA) model. MPDNN also demonstrates good robustness on datasets with scale and motion-posture variation, with accuracies of 89.48% and 91.45%, respectively. These results validate the effectiveness of the biological invariant object recognition mechanism in moving pedestrian detection.
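
The two-stage idea in the abstract can be illustrated with a toy sketch. This is not the MPDNN model: frame differencing stands in for the presynaptic network, and a corner-type sequence along a polygonal contour stands in for the sparse contour point representation. The function names, the `thresh` and `frac` parameters, and the L-shaped test contour are all assumptions made for illustration. The sketch only shows why the ordering of strong convex and concave contour points can survive scaling and translation.

```python
"""Toy two-stage sketch: motion mask, then a sparse convex/concave signature."""


def motion_mask(prev_frame, curr_frame, thresh=20):
    """Stage 1 (assumed stand-in for the presynaptic network):
    mark pixels whose gray level changed by more than `thresh`."""
    return [
        [1 if abs(c - p) > thresh else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


def turning_values(contour):
    """Cross product of the incoming and outgoing edge at every vertex.

    `contour` is a closed polygon given as ordered (x, y) vertices in
    counter-clockwise order with the y axis pointing up, so a positive
    value marks a convex corner and a negative value a concave one.
    """
    n = len(contour)
    values = []
    for i in range(n):
        x0, y0 = contour[i - 1]          # previous vertex (wraps around)
        x1, y1 = contour[i]              # current vertex
        x2, y2 = contour[(i + 1) % n]    # next vertex (wraps around)
        values.append((x1 - x0) * (y2 - y1) - (y1 - y0) * (x2 - x1))
    return values


def sparse_signature(contour, frac=0.2):
    """Stage 2 (assumed stand-in for the postsynaptic network):
    keep only corners whose turning value is at least `frac` of the
    strongest one, and record their type in contour order."""
    values = turning_values(contour)
    strongest = max(abs(v) for v in values)
    if strongest == 0:
        return []
    return [
        "convex" if v > 0 else "concave"
        for v in values
        if abs(v) >= frac * strongest
    ]


def transform(contour, scale, dx, dy):
    """Uniformly scale and translate a contour (hypothetical helper)."""
    return [(scale * x + dx, scale * y + dy) for x, y in contour]


# Hypothetical L-shaped contour, counter-clockwise.
L_SHAPE = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]

print(sparse_signature(L_SHAPE))
print(sparse_signature(transform(L_SHAPE, 3, 10, -4)))
```

Both calls print the same label sequence, because a uniform scale multiplies every turning value by the same positive factor and a translation leaves the edge vectors unchanged. A real detector would additionally need contour extraction from the binary mask and tolerance to pose changes, which this sketch does not attempt.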

Key words: object detection, moving pedestrian, invariant object recognition, visual motion perception, retinal nerve

CLC Number: