[1] SZEGEDY C, LIU W, JIA Y Q, et al. Going deeper with convolutions[C]//Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE, 2015:1-9. [2] PHAM N Q,NGUYEN T S,NIEHUES J,et al. Very deep selfattention networks for end-to-end speech recognition[EB/OL].[2020-09-26]. https://www.isca-speech.org/archive/Interspeech_2019/pdfs/2702.pdf. [3] LI X H,LAI T T,WANG S Y,et al. Weighted feature pyramid networks for object detection[C]//Proceedings of the 2019 IEEE International Conference on Parallel and Distributed Processing with Applications, Big Data and Cloud Computing, Sustainable Computing and Communications, Social Computing and Networking. Piscataway:IEEE,2019:1500-1504. [4] 冯文博, 洪征, 吴礼发, 等. 基于卷积神经网络的应用层协议识别方法[J]. 计算机应用,2019,39(12):3615-3621.(FENG W B,HONG Z,WU L F,et al. Application protocol recognition method based on convolutional neural network[J]. Journal of Computer Applications,2019,39(12):3615-3621.) [5] 刘尚旺, 刘承伟, 张爱丽. 基于深度可分卷积神经网络的实时人脸表情和性别分类[J]. 计算机应用,2020,40(4):990-995. (LIU S W,LIU C W,ZHANG A L. Real-time facial expression and gender recognition based on depthwise separable convolutional neural network[J]. Journal of Computer Applications,2020,40(4):990-995.) [6] 刘伟波, 曾庆宁, 卜玉婷, 等. 基于双微阵列与卷积神经网络的语音识别方法[J]. 计算机应用,2019,39(11):3268-3273.(LIU W B,ZENG Q N,BU Y T,et al. Speech recognition method based on dual micro-array and convolutional neural network[J]. Journal of Computer Applications,2019,39(11):3268-3273.) [7] YIN Q, LI Y F, HUANG H Z, et al. FPGA-based highperformance CNN accelerator architecture with high DSP utilization and efficient scheduling mode[C]//Proceedings of the 2020 International Conference on High Performance Big Data and Intelligent Systems. Piscataway:IEEE,2020:1-7. [8] SHEN J Z,HUANG Y,WANG Z L,et al. Towards a uniform template-based architecture for accelerating 2D and 3D CNNs on FPGA[C]//Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. New York:ACM, 2018:97-106. [9] RAJAT R, ZENG H Q, PRASANNA V. A flexible design automation tool for accelerating quantized spectral CNNs[C]//Proceedings of 29th International Conference on FieldProgrammable Logic and Applications. Piscataway:IEEE,2019:144-150. [10] LIANG Y,LU L Q,XIAO Q C,et al. Evaluating fast algorithms for convolutional neural networks on FPGAs[J]. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems,2020,39(4):857-870. [11] PODILI A, ZHANG C, PRASANNA V. Fast and efficient implementation of convolutional neural networks on FPGA[C]//Proceedings of the IEEE 28th International Conference on Application-Specific Systems, Architectures and Processors. Piscataway:IEEE,2017:11-18. [12] SHEN J Z,HUANG Y,WEN M,et al. Towards an efficient deep pipelined template-based architecture for accelerating the entire 2-D and 3-D CNNs on FPGA[J]. IEEE Transactions on ComputerAided Design of Integrated Circuits and Systems,2020,39(7):1442-1455. [13] ZHANG C,LI P,SUN G Y,et al. Optimizing FPGA-based accelerator design for deep convolutional neural networks[C]//Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. New York:ACM, 2015:161-170. [14] ZHU C Y,HUANG K J,YANG S Y,et al. An efficient hardware accelerator for structured sparse convolutional neural networks on FPGAs[J]. IEEE Transactions on Very Large Scale Integration (VLSI)Systems,2020,28(9):1953-1965. [15] ZHANG C,PRASANNA V. Frequency domain acceleration of convolutional neural networks on CPU-FPGA shared memory system[C]//Proceedings of 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. New York:ACM,2017:35-44. [16] GUO K Y,SUI L Z,QIU J T,et al. Angel-Eye:a complete design flow for mapping CNN onto embedded FPGA[J]. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems,2018,37(1):35-47. [17] WINOGRAD S. Arithmetic Complexity of Computations[M]. Philadelphia, PA:Society for Industrial and Applied Mathematics,1980:18-23. |