Journal of Computer Applications, 2025, Vol. 45, Issue (7): 2325-2332. DOI: 10.11772/j.issn.1001-9081.2024070961

• Multimedia computing and computer simulation •

Pedestrian detection algorithm based on multi-view information

Haoyu LIU1, Pengwei KONG2, Yaoli WANG3, Qing CHANG3

  1. College of Integrated Circuits, Taiyuan University of Technology, Jinzhong, Shanxi 030600, China
  2. Shanxi Weitao Food Technology Company Limited, Linfen, Shanxi 041000, China
  3. College of Electronic Information Engineering, Taiyuan University of Technology, Jinzhong, Shanxi 030600, China
  • Received: 2024-07-10 Revised: 2024-09-13 Accepted: 2024-09-26 Online: 2025-07-10 Published: 2025-07-10
  • Contact: Yaoli WANG
  • About author: LIU Haoyu, born in 1999, M. S. candidate. His research interests include computer vision, object detection, and object tracking.
    KONG Pengwei, born in 1978, engineer. His research interests include information security and big data.
    WANG Yaoli, born in 1965, Ph. D., associate professor. His research interests include machine vision, computational intelligence and optimization modeling, and wireless sensor networks.
    CHANG Qing, born in 1975, Ph. D., associate professor. His research interests include embedded systems, audio and video (AVS coding), and information system design theory.
  • Supported by:
    Key Research and Development Program of Shanxi (201903D321003); Enterprise Commissioned Development Program (RH2400001203)

基于多视角信息的行人检测算法 (Pedestrian detection algorithm based on multi-view information)

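LIU Haoyu (刘皓宇)1, KONG Pengwei (孔鹏伟)2, WANG Yaoli (王耀力)3, CHANG Qing (常青)3

  1. College of Integrated Circuits, Taiyuan University of Technology, Jinzhong, Shanxi 030600
  2. Shanxi Weitao Food Technology Company Limited, Linfen, Shanxi 041000
  3. College of Electronic Information Engineering, Taiyuan University of Technology, Jinzhong, Shanxi 030600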
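  • Corresponding author: WANG Yaoli (王耀力)
  • Author biographies: LIU Haoyu (1999-), male, born in Baoji, Shaanxi, M. S. candidate. Research interests: computer vision, object detection, and object tracking.
    KONG Pengwei (1978-), male, born in Houma, Shanxi, engineer. Research interests: information security and big data.
    WANG Yaoli (1965-), male, born in Dingzhou, Hebei, associate professor, Ph. D. Research interests: machine vision, computational intelligence and optimization modeling, and wireless sensor networks. tyutyjs0901@163.com
    CHANG Qing (1975-), male, born in Taiyuan, Shanxi, associate professor, Ph. D. Research interests: embedded systems, audio and video (AVS coding), and information system design theory.
  • Funding:
    Key Research and Development Program of Shanxi Province (201903D321003); Enterprise Commissioned Development Program (RH2400001203)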

Abstract:

To address the false detections and missed detections that existing multi-view pedestrian detection algorithms suffer because of severe object occlusion and because they ignore the relationships among views, an improved multi-view pedestrian detection algorithm based on the MVDeTr (MultiView Detection with shadow Transformer) algorithm was proposed. Firstly, in the feature extraction phase, a View Enhancement Module (VEM) was designed to enhance important views by focusing on the relationships among different views. Secondly, where multi-view information is introduced into a single view, an Efficient Multi-scale Attention (EMA) module was added to establish short-distance and long-distance dependencies, thereby improving detection performance. Finally, based on the Shadow Transformer module of the baseline algorithm, a new multi-view information processing module, the Efficient Shadow Transformer (EST), was designed to reduce the use of redundant information across views while maintaining detection performance. Experimental results show that, compared with the original MVDeTr algorithm on the Wildtrack dataset, the proposed algorithm improves the main detection metric MODA (Multiple Object Detection Accuracy) by 1.8 percentage points, MODP (Multiple Object Detection Precision) by 0.6 percentage points, and Recall by 1.8 percentage points, demonstrating its effectiveness in multi-view pedestrian detection tasks.

Key words: multi-view, pedestrian detection, MVDeTr (MultiView Detection with shadow Transformer), attention mechanism, feature enhancement
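
The abstract reports gains in MODA, MODP, and Recall but does not define them. As a rough illustration only, the sketch below computes these metrics using the definitions commonly used in multi-view detection evaluation; it is not taken from the paper. The function name `detection_scores`, the assumption that detections have already been matched one-to-one to ground-truth pedestrians, and the 0.5 distance threshold are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class DetectionScores:
    moda: float
    modp: float
    precision: float
    recall: float


def detection_scores(matched_distances: Sequence[float],
                     num_ground_truth: int,
                     num_detections: int,
                     dist_threshold: float = 0.5) -> DetectionScores:
    """Score ground-plane detections from already-matched pairs.

    matched_distances holds the distance between each detection and the
    ground-truth pedestrian it was matched to (hypothetical input; the
    matching step itself is not shown). Pairs farther apart than
    dist_threshold (an assumed value) are treated as unmatched.
    """
    kept = [d for d in matched_distances if d <= dist_threshold]
    tp = len(kept)                 # true positives
    fp = num_detections - tp       # detections with no valid match
    fn = num_ground_truth - tp     # pedestrians that were missed

    # MODA penalises both misses and false positives, relative to the
    # number of ground-truth pedestrians.
    moda = 1.0 - (fp + fn) / num_ground_truth if num_ground_truth else 0.0
    # MODP averages localisation quality (1 at zero distance, 0 at the
    # threshold) over the true positives.
    modp = sum(1.0 - d / dist_threshold for d in kept) / tp if tp else 0.0
    precision = tp / num_detections if num_detections else 0.0
    recall = tp / num_ground_truth if num_ground_truth else 0.0
    return DetectionScores(moda, modp, precision, recall)


if __name__ == "__main__":
    # Toy numbers, not results from the paper.
    print(detection_scores([0.1, 0.25, 0.6], num_ground_truth=4, num_detections=3))
```

Under these definitions, a gain of 1.8 percentage points in MODA means that the combined rate of misses and false positives, relative to the ground-truth count, fell by 0.018.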

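Abstract (translated from the Chinese version):

To address false detections and missed detections in existing multi-view pedestrian detection algorithms, which arise from severe target occlusion and from ignoring the relationships among views, a multi-view pedestrian detection algorithm improved from the MVDeTr (MultiView Detection with shadow Transformer) algorithm is proposed. First, in the feature extraction stage, a view feature enhancement module, VEM (View Enhancement Module), is designed to enhance important views by attending to the relationships among different views. Second, when multi-view information is introduced into a single view, an Efficient Multi-scale Attention (EMA) module is added to establish short-range and long-range dependencies, thereby improving detection performance. Finally, building on the Shadow Transformer module of the original baseline algorithm, a new multi-view information processing module, EST (Efficient Shadow Transformer), is designed to reduce the use of redundant information across views while maintaining detection performance. Experimental results show that, compared with the original MVDeTr algorithm on the Wildtrack dataset, the proposed algorithm improves the main detection metric MODA (Multiple Object Detection Accuracy) by 1.8 percentage points, the detection metric MODP (Multiple Object Detection Precision) by 0.6 percentage points, and recall by 1.8 percentage points. The proposed algorithm is therefore well suited to multi-view pedestrian detection tasks.

Key words: multi-view, pedestrian detection, MVDeTr, attention mechanism, feature enhancement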

CLC Number: