《计算机应用》唯一官方网站


基于图形重写和融合探索的张量虚拟机算符融合优化

王娜1,蒋林1,李远成1,朱筠2   

  1. 西安科技大学 计算机科学与技术学院;2. 西安邮电大学 电子工程学院
  • 收稿日期:2023-09-12 修回日期:2023-11-11 发布日期:2024-03-15 出版日期:2024-03-15
  • 通讯作者: 蒋林
  • 作者简介:王娜(1994—),女,陕西渭南人,硕士研究生,主要研究方向:可重构编译优化、深度学习;蒋林(1970—),男,陕西杨凌人,教授,博士,博士生导师,主要研究方向:专用集成电路设计、计算机体系结构、计算机图形处理;李远成(1981—),男,河南开封人,讲师,博士,硕士生导师, CCF会员,主要研究方向:计算机体系结构、并行计算、人工智能;朱筠(1981—),女,陕西西安人,讲师,硕士,主要研究方向:集成电路设计及仿真。
  • 基金资助:
    科技创新2030——“新一代人工智能”重大项目(2022ZD0119005);国家自然科学基金资助项目(61834005);陕西省自然科学基金项目(2020JM-525)

Optimization of tensor virtual machine operator fusion based on graph rewriting and fusion exploration

WANG Na1, JIANG Lin1, LI Yuancheng1, ZHU Yun2   

  1. College of Computer Science and Technology, Xi’an University of Science and Technology; 2. School of Electronic Engineering, Xi’an University of Posts and Telecommunications
  • Received:2023-09-12 Revised:2023-11-11 Online:2024-03-15 Published:2024-03-15
  • About author:WANG Na, born in 1994, M.S. candidate. Her research interests include reconfigurable compiler optimization and deep learning. JIANG Lin, born in 1970, Ph.D., professor. His research interests include application-specific integrated circuit design, computer architecture, and computer graphics and image processing. LI Yuancheng, born in 1981, Ph.D., lecturer. His research interests include high performance computer architecture, parallel computing, and artificial intelligence. ZHU Yun, born in 1981, M.S., lecturer. Her research interests include integrated circuit design and simulation.
  • Supported by:
    Scientific and Technological Innovation 2030 — Major Project of New Generation Artificial Intelligence (2022ZD0119005), National Natural Science Foundation of China (61834005), Shaanxi Natural Science Foundation Project (2020JM-525)

摘要: 针对计算密集型神经网络在使用张量虚拟机(TVM)算符融合过程中对计算图进行逐层查找导致访问次数过多、内存资源利用率低等问题,提出一种基于图形重写和融合探索的TVM算符融合优化方法。首先,对运算符的映射类型进行分析;其次,基于运算定律对计算图进行重写,简化计算图结构以减少中间结果生成,降低内存资源消耗并提升融合效率;再次,采用融合探索算法寻找融合代价较小的算符优先进行融合,避免数据冗余和寄存器溢出;最后,在CPU上实现神经网络算符融合,并测试融合加速性能。实验结果表明,所提方法可有效减少计算图层数和算符个数,降低访存频率和数据传输量。与TVM算符融合方法相比,融合过程中计算图层数平均减少18%,融合速度平均提升23%,验证了该方法在优化计算图融合过程中的有效性。

关键词: 算符融合, 图形重写, 张量虚拟机, 神经网络, 融合探索

Abstract: To address the excessive access counts and low memory resource utilization caused by layer-wise traversal of the computational graph when compute-intensive neural networks undergo operator fusion in the Tensor Virtual Machine (TVM), an optimization method for TVM operator fusion based on graph rewriting and fusion exploration was proposed. First, the mapping types of the operators were analyzed. Then, the computational graph was rewritten according to operation laws, simplifying its structure to reduce the generation of intermediate results, thereby lowering memory resource consumption and improving fusion efficiency. Next, a fusion exploration algorithm was used to fuse operators with lower fusion cost first, avoiding data redundancy and register overflow. Finally, neural network operator fusion was implemented on a CPU, and the fusion acceleration performance was tested. Experimental results show that the proposed method effectively reduces the number of computational graph layers and operators, as well as the memory access frequency and data transfer volume. Compared with the TVM operator fusion method, the number of computational graph layers during fusion is reduced by 18% on average and the fusion speed is improved by 23% on average, verifying the effectiveness of the method in optimizing the computational graph fusion process.
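As a rough illustration of the two passes the abstract describes, the following is a minimal, hypothetical sketch (not TVM's actual implementation or API): a graph-rewriting pass collapses chains of element-wise operators so their intermediate tensors are never materialized, and a fusion-exploration pass then greedily fuses the adjacent operator pair with the smallest estimated cost first. All names, the `Op` structure, and the cost model (size of the eliminated intermediate tensor) are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Op:
    name: str
    kind: str                      # "elemwise" or "complex" (e.g. conv)
    out_size: int                  # number of elements in the output tensor
    inputs: List["Op"] = field(default_factory=list)

def rewrite_elemwise_chains(ops):
    """Rewriting pass: collapse each run of consecutive element-wise ops
    into a single fused op, so no intermediate results are generated."""
    rewritten, chain = [], []
    for op in ops:
        if op.kind == "elemwise":
            chain.append(op)
            continue
        if chain:                  # close the current element-wise chain
            rewritten.append(Op("+".join(o.name for o in chain),
                                "elemwise", chain[-1].out_size))
            chain = []
        rewritten.append(op)
    if chain:
        rewritten.append(Op("+".join(o.name for o in chain),
                            "elemwise", chain[-1].out_size))
    return rewritten

def fusion_cost(producer, consumer):
    """Toy cost proxy: the intermediate tensor that fusing eliminates."""
    return producer.out_size

def explore_fusion(ops, budget):
    """Exploration pass: repeatedly fuse the adjacent pair with the
    lowest fusion cost, stopping once the cheapest pair exceeds the
    budget (a stand-in for register/memory pressure limits)."""
    ops = list(ops)
    while len(ops) > 1:
        i = min(range(len(ops) - 1),
                key=lambda k: fusion_cost(ops[k], ops[k + 1]))
        if fusion_cost(ops[i], ops[i + 1]) > budget:
            break
        fused = Op(f"({ops[i].name}|{ops[i + 1].name})", "complex",
                   ops[i + 1].out_size)
        ops[i:i + 2] = [fused]
    return ops
```

For example, a `conv → add → relu → pool` sequence first becomes `conv → add+relu → pool` under rewriting; exploration then fuses pairs in cost order until the budget is exhausted.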

Key words: operator fusion, graph rewriting, tensor virtual machine, neural network, fusion exploration

中图分类号: