Journal of Computer Applications ›› 2019, Vol. 39 ›› Issue (1): 192-198. DOI: 10.11772/j.issn.1001-9081.2018051134

• Artificial Intelligence •

Real-time implementation of an improved TINY YOLO vehicle detection algorithm based on hardware acceleration on a small Zynq SoC

ZHANG Yunke, LIU Dan

  1. Research Institute of Electronic Science and Technology, University of Electronic Science and Technology of China, Chengdu, Sichuan 611731, China
  • Received: 2018-06-01 Revised: 2018-06-21 Online: 2019-01-10 Published: 2019-01-21
  • Corresponding author: LIU Dan
  • About the authors: ZHANG Yunke (born 1993), male, from Chengdu, Sichuan, master's student; main research interests: artificial intelligence, high-performance computing, deep neural networks, pattern recognition. LIU Dan (born 1969), male, from Chengdu, Sichuan, associate professor, Ph.D.; main research interests: network security, Web data mining, image processing.

Abstract: The TINY YOLO (TINY You Only Look Once) vehicle detection algorithm is computationally expensive, which makes real-time detection difficult to achieve on small embedded systems. Exploiting the architectural advantages of a small Zynq SoC system and the fact that the TINY YOLO network weights contain a large number of near-zero values, a hardware-parallel accelerated version of the algorithm, called Xerantic-TINY YOLO (X-TINY YOLO), a condensed small deep network for vehicle detection, was proposed. Firstly, the network structure of TINY YOLO was compressed. Secondly, the convolution computation was accelerated with an efficient multistage pipeline in which all multiply-add operations within each stage were executed in parallel. Finally, a data segmentation and transfer scheme matched to the network structure was proposed. The experimental results show that X-TINY YOLO consumes only 50% of the on-chip hardware resources and can be implemented on a small Zynq SoC system, which offers a higher performance-price ratio than GPUs and CPUs and is better suited to embedded scenarios. Its detection speed reaches 24 frames per second, which meets the real-time requirement of vehicle detection.

Key words: vehicle detection, machine vision, TINY You Only Look Once (TINY YOLO), Zynq-7020, hardware acceleration
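
The abstract notes that many TINY YOLO weights are close to zero and that the network is compressed before acceleration, but it does not say how the compression is carried out. As a rough illustration only, the sketch below assumes simple magnitude-threshold pruning. The function names, the `threshold` value and the example kernel are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def prune_near_zero(weights: List[float], threshold: float) -> Tuple[List[float], float]:
    """Zero out weights whose magnitude is below `threshold`.

    Returns the pruned weights and the fraction of weights that are zero
    afterwards. `threshold` is a hypothetical tuning parameter, not a value
    reported in the paper.
    """
    pruned = [0.0 if abs(w) < threshold else w for w in weights]
    sparsity = sum(1 for w in pruned if w == 0.0) / len(pruned) if pruned else 0.0
    return pruned, sparsity


def sparse_dot(weights: List[float], inputs: List[float]) -> float:
    """Multiply-accumulate over one kernel window, skipping pruned (zero) weights."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    return sum(w * x for w, x in zip(weights, inputs) if w != 0.0)


# Illustrative 3x3 kernel (flattened) with several near-zero entries.
kernel = [0.5, -0.001, 0.002, -0.25, 0.0004, 0.75, -0.003, 0.125, 0.0]
pruned_kernel, sparsity = prune_near_zero(kernel, threshold=0.01)

# One 3x3 input patch (flattened) to convolve with the pruned kernel.
patch = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
result = sparse_dot(pruned_kernel, patch)
```

In this toy case five of the nine weights are pruned, so only four multiply-add operations remain for the window. Fewer remaining operations is the kind of saving that would make a compressed network easier to fit into a fixed amount of on-chip parallel hardware, although the mapping used in the paper may differ.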
