Journal of Computer Applications ›› 2016, Vol. 36 ›› Issue (8): 2213-2218. DOI: 10.11772/j.issn.1001-9081.2016.08.2213

• Advanced computing •

Design and implementation of a high-performance floating-point multiply-accumulate unit in M-DSP

车文博, 刘衡竹, 田甜   

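  1. College of Computer, National University of Defense Technology, Changsha 410073
  • Received:2016-01-15 Revised:2016-03-12 Online:2016-08-10 Published:2016-08-10
  • Corresponding author: TIAN Tian (田甜)
  • About the authors: CHE Wenbo (车文博), born 1981, male, from Wucheng, Shandong, M.S. candidate; research interest: microprocessor design. LIU Hengzhu (刘衡竹), born 1963, male, from Huaihua, Hunan, professor, Ph.D.; research interests: microprocessor design and computer architecture. TIAN Tian (田甜), born 1983, male, from Lixian, Hunan, M.S.; research interest: microprocessor design.
  • Supported by:
    Aerospace Science Foundation project (2013ZC88003).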

Design and implementation of high performance floating-point multiply accumulate for M-DSP

CHE Wenbo, LIU Hengzhu, TIAN Tian   

  1. College of Computer, National University of Defense Technology, Changsha, Hunan 410073, China
  • Received:2016-01-15 Revised:2016-03-12 Online:2016-08-10 Published:2016-08-10
  • Supported by:
    This work is partially supported by the Aerospace Science Foundation of China (2013ZC88003).

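Abstract: To meet the performance, area, and power requirements that the high-performance M-type digital signal processor (M-DSP) places on floating-point computation, the overall architecture of the M-DSP and the characteristics of its floating-point instructions were analyzed, and a high-performance, low-power Floating-point Multiply ACcumulate (FMAC) unit was designed and implemented. The unit uses a main structure with separate single-precision and double-precision paths and executes in six pipeline stages. Key modules such as the multiplier and the alignment shifter are designed for reuse. The unit supports double-precision and single-precision floating-point multiplication, multiply-accumulate, multiply-subtract, single-precision dot product, and complex arithmetic. The design was fully verified and synthesized with Synopsys Design Compiler in a 45 nm process. The synthesis results show an operating frequency of up to 1 GHz and a cell area of 36856 μm²; compared with the multiply-accumulate unit in FT-XDSP, the area is reduced by 12.95% and the critical path length is shortened by 2.17%.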

Keywords: floating-point multiplication, floating-point multiply-accumulate unit, floating-point dot product, Booth algorithm, IEEE

Abstract: To meet the performance, power, and area requirements of floating-point computing in M-DSP, the architecture of an M-DSP and the characteristics of its floating-point instructions were analyzed, and a high-performance, low-power Floating-point Multiply ACcumulate (FMAC) unit was proposed. The proposed FMAC uses separate single-precision and double-precision paths and is divided into six pipeline stages. Its key modules, including the multiplier and the alignment shifter, were designed for reuse. It implements single-precision and double-precision floating-point multiplication, multiply-add, and multiply-subtract, as well as floating-point complex multiplication and dot product. The proposed FMAC was fully verified and synthesized with Synopsys Design Compiler in a 45 nm process. Experimental results show that the proposed FMAC runs at up to 1 GHz with an area of 36856 μm²; compared with the FMAC of FT-XDSP, the area is reduced by 12.95% and the critical path is shortened by 2.17%.
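The operations listed in the abstract can be illustrated with a small behavioral model. The Python sketch below is only a reference for what such operations compute, not the authors' design: the function names are hypothetical, the operand order of multiply-subtract (c - a*b) is an assumption, the model rounds once at the end (a fused style that the abstract does not confirm), and it works in double precision for finite inputs only.

```python
from fractions import Fraction


def fmac(a: float, b: float, c: float) -> float:
    """Multiply-accumulate a*b + c, computed exactly and rounded once."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def fmsub(a: float, b: float, c: float) -> float:
    """Multiply-subtract, assumed here to mean c - a*b, rounded once."""
    return float(Fraction(c) - Fraction(a) * Fraction(b))


def dot2(a0: float, b0: float, a1: float, b1: float) -> float:
    """Two-term dot product a0*b0 + a1*b1, rounded once."""
    return float(Fraction(a0) * Fraction(b0) + Fraction(a1) * Fraction(b1))


def complex_mul(ar: float, ai: float, br: float, bi: float) -> tuple:
    """Complex product (ar + ai*i) * (br + bi*i) as (real, imag)."""
    real = fmsub(ai, bi, float(Fraction(ar) * Fraction(br)))  # ar*br - ai*bi
    imag = dot2(ar, bi, ai, br)                               # ar*bi + ai*br
    return real, imag


if __name__ == "__main__":
    a, b = 1.0 + 2.0**-30, 1.0 - 2.0**-30
    # The exact product is 1 - 2**-60; rounding it before the add loses it.
    print(a * b - 1.0)       # separately rounded multiply, then add
    print(fmac(a, b, -1.0))  # single rounding keeps the small term
    print(complex_mul(1.0, 2.0, 3.0, 4.0))
```

The first two printed values differ because the separately rounded product collapses to 1.0, while the single-rounding model preserves the residual term. Note that `complex_mul` rounds the partial product ar*br before the subtraction, so its real part involves two roundings.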

Key words: floating-point multiplier, Floating-point Multiply ACcumulate(FMAC), floating-point dot product, Booth algorithm, IEEE

CLC number: