Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (5): 1455-1463. DOI: 10.11772/j.issn.1001-9081.2024050609

• Artificial Intelligence •

Deep symbolic regression method based on Transformer

Pengcheng XU1,2, Lei HE2, Chuan LI1, Weiqi QIAN2, Tun ZHAO2

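  1. College of Computer Science, Sichuan University, Chengdu, Sichuan 610065, China
  2. State Key Laboratory of Aerodynamics (China Aerodynamics Research and Development Center), Mianyang, Sichuan 621000, China
  • Received: 2024-05-15 Revised: 2024-09-21 Accepted: 2024-09-29 Online: 2024-10-09 Published: 2025-05-10
  • Corresponding author: Lei HE
  • About the authors: XU Pengcheng (born 1999), male, from Chongqing, M.S. candidate. Research interests: deep reinforcement learning, deep symbolic regression.
    HE Lei (born 1988), male, from Mianzhu, Sichuan, Ph.D., associate research fellow. Research interests: machine learning, aerodynamic modeling.
    LI Chuan (born 1977), male, from Zhengzhou, Henan, Ph.D., associate professor. Research interests: deep reinforcement learning, data mining, knowledge engineering.
    QIAN Weiqi (born 1973), male, from Wuxi, Jiangsu, Ph.D., research fellow, doctoral supervisor. Research interests: aerodynamics, aerodynamic parameter identification.
    ZHAO Tun (born 1989), male, from Xihe, Gansu, Ph.D., associate research fellow. Research interests: flight control, artificial intelligence.
  • Supported by:
    Intellectual Strength Foundation (ISF) Project for Talent Excellence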

Abstract:

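To address the loss of population diversity and the sensitivity to hyperparameters that arise when genetic evolutionary algorithms are used to solve Symbolic Regression (SR) problems, a Transformer-based Deep Symbolic Regression Technique (DSRT) was proposed. The method used the Transformer to generate an expression symbol sequence autoregressively. A transformed value of the goodness of fit between the data and the expression symbol sequence then served as the reward, and the model parameters were updated through deep reinforcement learning so that the generated expression sequences fitted the data better; as the model converged, the optimal expression was identified. The effectiveness of the DSRT method was tested on the SR benchmark dataset Nguyen, where it was compared with the DSR (Deep Symbolic Regression) and GP (Genetic Programming) algorithms within 200 iterations, and the experimental results verified its effectiveness. In addition, the influence of each parameter on the DSRT method was discussed, and an experiment on predicting the formula for the surface pressure coefficient of an aircraft airfoil was performed on NACA4421 data. The obtained formula was compared with the Kármán-Tsien formula, and a mathematical formula with a lower Root Mean Square Error (RMSE) was found.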
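The reward step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the token library `TOKENS`, the prefix (Polish) notation for the symbol sequence, and the transform `1 / (1 + RMSE)` are all assumptions chosen for illustration, since the abstract only states that a transformed fitness value is used as the reward. The Transformer policy and the reinforcement learning update are omitted.

```python
import math

# Hypothetical token library: name -> (arity, function). Not taken from the paper.
TOKENS = {
    "add": (2, lambda a, b: a + b),
    "sub": (2, lambda a, b: a - b),
    "mul": (2, lambda a, b: a * b),
    "sin": (1, math.sin),
    "cos": (1, math.cos),
    "x": (0, None),  # the single input variable
}

def eval_prefix(tokens, x):
    """Evaluate a prefix-notation token sequence at the scalar input x."""
    def walk(pos):
        if pos >= len(tokens):
            raise ValueError("token sequence ends before the expression is complete")
        arity, fn = TOKENS[tokens[pos]]
        pos += 1
        if arity == 0:
            return x, pos
        args = []
        for _ in range(arity):
            value, pos = walk(pos)
            args.append(value)
        return fn(*args), pos

    value, end = walk(0)
    if end != len(tokens):
        raise ValueError("token sequence contains trailing symbols")
    return value

def rmse(tokens, xs, ys):
    """Root mean square error of the expression against the data."""
    squared = [(eval_prefix(tokens, x) - y) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared) / len(squared))

def reward(tokens, xs, ys):
    """Assumed transform: maps RMSE into (0, 1], with 1 for an exact fit."""
    return 1.0 / (1.0 + rmse(tokens, xs, ys))

# Toy data sampled from x^2 + x on [-1, 1].
xs = [i / 10 for i in range(-10, 11)]
ys = [x * x + x for x in xs]

exact = ["add", "mul", "x", "x", "x"]  # x*x + x
rough = ["add", "x", "x"]              # x + x

print(reward(exact, xs, ys), reward(rough, xs, ys))
```

In a full training loop, sequences such as `exact` and `rough` would be sampled from the model, and sequences with a higher reward would be made more likely by the policy update.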

Key words: Symbolic Regression (SR), Transformer, deep reinforcement learning, NACA4412, Kármán-Tsien formula

CLC number: