Journal of Computer Applications ›› 2020, Vol. 40 ›› Issue (10): 2811-2816. DOI: 10.11772/j.issn.1001-9081.2020020256

• Artificial Intelligence •

FPGA-based convolutional neural network fixed-point acceleration

LEI Xiaokang1,2, YIN Zhigang2, ZHAO Ruilian1   

  1. School of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China;
    2. Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2020-03-16  Revised: 2020-04-22  Online: 2020-10-10  Published: 2020-05-15
  • Corresponding author: YIN Zhigang
  • About the authors: LEI Xiaokang (1994-), male, born in Zhoukou, Henan, M.S. candidate; research interests: deep learning, compression and acceleration of convolutional neural network models. YIN Zhigang (1976-), male, born in Tianmen, Hubei, Ph.D., research fellow; research interests: artificial intelligence, processor chip architecture. ZHAO Ruilian (1964-), female, born in Xinzhou, Shanxi, Ph.D., professor; research interests: software testing, software reliability.
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61672085).

Abstract: To address the high power consumption and slow execution of Convolutional Neural Networks (CNN) on resource-constrained hardware devices, a fixed-point computation acceleration method for CNN based on Field Programmable Gate Array (FPGA) was proposed. First, a fixed-point quantization method was proposed: a separate scale parameter was designed for each convolution layer, and the relative divergence was used to determine the bit width, so as to reduce the storage space of the CNN parameters; the effect of different quantization intervals on CNN accuracy was also studied. Then, a parameter reuse scheme and a pipelined computation scheme were designed to accelerate the convolution computation. To verify the acceleration effect after fixed-point quantization, two datasets, face and ship, were used. The results show that, compared with traditional floating-point convolution computation, and with only a small loss of CNN accuracy, when the weight parameters and the input feature map values are quantized to 7-bit, the compressed weight parameter file of the face recognition CNN model shrinks to about 22% of its original size, the convolution computation achieves a speedup of 18.69, and the utilization of the multiplier-accumulators in the FPGA reaches 94.5%. The experimental results show that the proposed method improves the speed of convolution computation and makes efficient use of FPGA hardware resources.
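The abstract mentions a per-layer scale parameter and a "relative divergence" criterion for choosing the quantization bit width, but does not give the exact formulas. The following is only an illustrative sketch of that general idea, assuming symmetric fixed-point quantization and a KL-style divergence over value histograms; the function names `quantize_per_layer` and `best_bit_width` and the 0.05 threshold are hypothetical, not taken from the paper.

```python
import numpy as np

def quantize_per_layer(w, bits=7):
    """Symmetric fixed-point quantization of one layer's weights.
    Maps floats to integers in [-(2^(bits-1)-1), 2^(bits-1)-1] using a
    single per-layer scale; the scale is kept for dequantization."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(w)) / qmax          # one scale per layer
    q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def relative_divergence(p, q, eps=1e-12):
    """KL divergence between two normalized histograms, used here as
    the 'relative divergence' criterion between the float and the
    dequantized value distributions."""
    p = p / (p.sum() + eps)
    q = q / (q.sum() + eps)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / (q[mask] + eps))))

def best_bit_width(w, candidates=(4, 5, 6, 7, 8), bins=128, threshold=0.05):
    """Pick the smallest candidate bit width whose dequantized
    distribution stays close to the float distribution."""
    ref, edges = np.histogram(w, bins=bins)
    for bits in candidates:
        q, scale = quantize_per_layer(w, bits)
        hist, _ = np.histogram(q.astype(np.float32) * scale, bins=edges)
        if relative_divergence(ref.astype(float), hist.astype(float)) < threshold:
            return bits
    return candidates[-1]
```

For example, `quantize_per_layer(weights, bits=7)` maps a layer's weights into the 7-bit range [-63, 63], with a worst-case rounding error of half the scale per value; `best_bit_width` then searches the candidate bit widths for the cheapest one the divergence criterion accepts.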

Key words: Convolutional Neural Network (CNN), fixed-point quantization, Field Programmable Gate Array (FPGA), model compression, YOLO model

CLC number: