基于ARMv8架构的面向机器翻译的单精度浮点通用矩阵乘法优化
龚鸣清, 叶煌, 张鉴, 卢兴敬, 陈伟
Single precision floating general matrix multiply optimization for machine translation based on ARMv8 architecture
GONG Mingqing, YE Huang, ZHANG Jian, LU Xingjing, CHEN Wei
Journal of Computer Applications (计算机应用). 2019, (6): 1557-1562. DOI: 10.11772/j.issn.1001-9081.2018122608