Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (5): 1455-1463.DOI: 10.11772/j.issn.1001-9081.2024050609
• Artificial intelligence •
Pengcheng XU1,2, Lei HE2, Chuan LI1, Weiqi QIAN2, Tun ZHAO2
Received: 2024-05-15
Revised: 2024-09-21
Accepted: 2024-09-29
Online: 2024-10-09
Published: 2025-05-10
Contact: Lei HE
About author: XU Pengcheng, born in 1999 in Chongqing, M.S. candidate. His research interests include deep reinforcement learning and deep symbolic regression.
Pengcheng XU, Lei HE, Chuan LI, Weiqi QIAN, Tun ZHAO. Deep symbolic regression method based on Transformer[J]. Journal of Computer Applications, 2025, 45(5): 1455-1463.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2024050609
| Parameter | Value |
|---|---|
| Candidate symbol library | [+, -, *, /, sin, exp, log, 1, f] |
| Feedforward network dimension | 128 |
| Number of encoder/decoder layers | 3 |
| Learning rate | 0.002 5 |

Tab. 1 Parameter setting of DSRT algorithm
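The settings in Tab. 1 describe a small encoder-decoder Transformer. A minimal sketch of the configuration is given below; the variable names are illustrative, not taken from the authors' code, and the choice of Adam as optimizer is an assumption.

```python
# Hyperparameters transcribed from Tab. 1; names are illustrative, not the authors' code.
DSRT_CONFIG = {
    "symbol_library": ["+", "-", "*", "/", "sin", "exp", "log", "1", "f"],
    "dim_feedforward": 128,   # width of the feedforward sub-layer
    "num_layers": 3,          # depth of both encoder and decoder
    "learning_rate": 2.5e-3,  # step size (Adam assumed)
}

# Token ids for the candidate symbol library (id 0 reserved for padding).
vocab = {sym: i + 1 for i, sym in enumerate(DSRT_CONFIG["symbol_library"])}
print(len(vocab))  # 9 candidate symbols
```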
| Parameter | Value | Parameter | Value |
|---|---|---|---|
| Initial population | 5 000 | Subtree mutation rate | 0.1 |
| Iterations | 200 | Hoist mutation rate | 0.05 |
| Crossover rate | 0.7 | Point mutation rate | 0.1 |

Tab. 2 Parameter setting of GP algorithm
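The genetic-operator rates in Tab. 2 must leave some probability mass for plain reproduction (copying an individual unchanged into the next generation). A quick sanity check, with parameter names modeled on common GP toolkits rather than the authors' implementation:

```python
# GP settings transcribed from Tab. 2; key names follow common GP-toolkit conventions.
gp_params = {
    "population_size": 5000,
    "generations": 200,
    "p_crossover": 0.7,
    "p_subtree_mutation": 0.1,
    "p_hoist_mutation": 0.05,
    "p_point_mutation": 0.1,
}

# Operator probabilities must sum to at most 1; the remainder is the
# chance an individual is reproduced unchanged.
p_ops = sum(v for k, v in gp_params.items() if k.startswith("p_"))
print(round(1.0 - p_ops, 2))  # probability of plain reproduction: 0.05
```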
| Benchmark | Expression | Sampling range | DSRT | DSR | GP |
|---|---|---|---|---|---|
| Nguyen-1 | | U(-1,1,20) | 0.00 | 0.00 | 0.00 |
| Nguyen-2 | | | 0.00 | 0.00 | 0.00 |
| Nguyen-3 | | | 0.00 | 0.00 | 0.00 |
| Nguyen-4 | | | 0.00 | 0.00 | 9.21 |
| Nguyen-5 | | | 1.17 | 1.77 | 7.16 |
| Nguyen-6 | | | 0.00 | 0.00 | 5.70 |
| Nguyen-7 | | U(0,2,20) | 3.09 | 2.38 | 13.92 |
| Nguyen-8 | | U(0,4,20) | 0.00 | 0.00 | 0.00 |
| Nguyen-9 | | U(-1,1,100) | 1.24 | 0.00 | 0.00 |
| Nguyen-10 | | | 0.00 | 0.00 | 0.00 |
| Nguyen-11 | | U(0,1,100) | 0.00 | 1.29 | 6.22 |
| Nguyen-12 | | | 2.53 | 1.41 | 5.36 |

Tab. 3 RMSE (%) of DSRT compared with DSR and GP (expressions, rendered as images in the original, are omitted)
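The RMSE entries in Tab. 3 can be reproduced by sampling the stated range and comparing a recovered expression against the ground truth. A minimal sketch for Nguyen-1, whose standard definition is x³ + x² + x (an exactly recovered expression yields the 0.00 entries in the table):

```python
import math
import random

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

# U(-1, 1, 20): 20 points drawn uniformly from [-1, 1].
random.seed(0)
xs = [random.uniform(-1.0, 1.0) for _ in range(20)]

nguyen1 = lambda x: x ** 3 + x ** 2 + x      # standard Nguyen-1 target
candidate = lambda x: x * (x * (x + 1) + 1)  # algebraically identical form

y_true = [nguyen1(x) for x in xs]
y_pred = [candidate(x) for x in xs]
print(rmse(y_true, y_pred))  # exact recovery -> RMSE ~0 (up to float rounding)
```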
| Encoder/decoder layers | Time of min RMSE/s | Epoch of min RMSE | Min RMSE/% | Formula |
|---|---|---|---|---|
| 1 | 1 096.13 | 15 | 2.52 | |
| 2 | 2 415.91 | 29 | 3.86 | |
| 3 | 803.15 | 11 | 3.96 | |
| 4 | 696.76 | 9 | 4.07 | |
| 7 | 539.50 | 7 | 3.96 | |

Tab. 4 Influence of number of encoder/decoder layers on results
| Feedforward network dimension | Time of min RMSE/s | Epoch of min RMSE | Min RMSE/% | Formula |
|---|---|---|---|---|
| 16 | 2 037.09 | 26 | 3.91 | |
| 32 | 1 654.32 | 22 | 3.92 | |
| 64 | 459.43 | 6 | 3.75 | |
| 128 | 2 085.66 | 21 | 3.00 | |
| 256 | 1 593.14 | 21 | 3.06 | |
| 512 | 2 098.43 | 27 | 2.58 | |
| 1 024 | 1 993.78 | 25 | 3.12 | |
| 2 048 | 1 937.79 | 26 | 3.85 | |

Tab. 5 Influence of feedforward neural network dimension on results
| Learning rate | Time of min RMSE/s | Epoch of min RMSE | Min RMSE/% | Formula |
|---|---|---|---|---|
| 0.000 1 | 638.73 | 9 | 3.84 | |
| 0.000 5 | 439.74 | 5 | 3.87 | |
| 0.001 0 | 1 914.95 | 26 | 3.86 | |
| 0.002 0 | 115.03 | 8 | 3.80 | |
| 0.002 5 | 1 936.37 | 17 | 3.35 | |
| 0.005 0 | 2 041.99 | 28 | 3.86 | |
| 0.010 0 | 827.10 | 10 | 3.87 | |

Tab. 6 Influence of learning rate on results
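The learning rate 0.002 5 adopted in Tab. 1 coincides with the minimizer of the sweep in Tab. 6, which can be checked directly from the transcribed values:

```python
# (learning rate, minimum RMSE %) pairs transcribed from Tab. 6.
lr_sweep = [
    (0.0001, 3.84), (0.0005, 3.87), (0.0010, 3.86), (0.0020, 3.80),
    (0.0025, 3.35), (0.0050, 3.86), (0.0100, 3.87),
]
best_lr, best_rmse = min(lr_sweep, key=lambda row: row[1])
print(best_lr)  # 0.0025, the value adopted in Tab. 1
```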
0.069 282 | -0.626 16 | 0.350 32 | -0.689 80 |
0.098 867 | -0.601 15 | 0.484 85 | -0.790 64 |
0.188 260 | -0.636 57 | 0.531 01 | -0.793 30 |
0.220 030 | -0.622 63 | 0.569 55 | -0.816 01 |
0.226 720 | -0.664 31 | 0.603 83 | -0.801 06 |
0.258 940 | -0.626 75 | 0.587 74 | -0.893 89 |
0.292 330 | -0.667 03 | 0.629 72 | -0.853 86 |
0.331 560 | -0.647 07 |
Tab. 7 NACA4412 real data
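The interpolation experiments compare DSRT-discovered formulas against the Kármán-Tsien compressibility correction. For reference, a sketch of that classical rule in its standard textbook form (not taken from the paper's code):

```python
import math

def karman_tsien(cp0, mach):
    """Karman-Tsien rule: corrects an incompressible pressure
    coefficient cp0 for free-stream Mach number effects."""
    beta = math.sqrt(1.0 - mach ** 2)  # Prandtl-Glauert factor
    return cp0 / (beta + (mach ** 2 / (1.0 + beta)) * (cp0 / 2.0))

# At Mach 0 the correction is the identity; suction peaks strengthen with Mach.
print(karman_tsien(-0.6, 0.0))          # -0.6
print(karman_tsien(-0.6, 0.5) < -0.6)   # True
```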
| Parameter | Value |
|---|---|
| Candidate symbol library | [+, -, *, /, sin, cos, exp, log, 1, f] |
| Feedforward network dimension | 256 |
| Number of encoder/decoder layers | 1 |
| Learning rate | 0.002 5 |

Tab. 8 Parameter setting of DSRT algorithm in NACA4412 experiments
| No. | Formula | DSRT RMSE/10⁻² | Kármán-Tsien RMSE/10⁻² |
|---|---|---|---|
| 1 | | 1.19 | 4.62 |
| 2 | | 1.15 | 3.74 |
| 3 | | 2.19 | 3.23 |

Tab. 9 Interpolation results (formulas, rendered as images in the original, are omitted)
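Averaging the three interpolation cases in Tab. 9 gives a quick sense of the gap between the discovered formulas and the classical correction (a rough summary computed here, not a statistic reported by the paper):

```python
# RMSE values (units of 1e-2) transcribed from Tab. 9.
dsrt = [1.19, 1.15, 2.19]
kt = [4.62, 3.74, 3.23]  # Karman-Tsien formula, same units

mean = lambda xs: sum(xs) / len(xs)
print(round(mean(dsrt), 2), round(mean(kt), 2))  # 1.51 3.86
```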
1 | KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[J]. Communications of the ACM, 2017, 60(6):84-90. |
2 | HE K, ZHANG X, REN S, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9):1904-1916. |
3 | THIRUNAVUKARASU A J, TING D S J, ELANGOVAN K, et al. Large language models in medicine[J]. Nature Medicine, 2023, 29(8): 1930-1940. |
4 | WU L, LI T, WANG L, et al. Improving hybrid CTC/Attention architecture with time-restricted self-attention CTC for end-to-end speech recognition[J]. Applied Sciences, 2019, 9(21): No.4639. |
5 | ZHOU G, ZHU X, SONG C, et al. Deep interest network for click-through rate prediction[C]// Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2018: 1059-1068. |
6 | WANG H, FU T, DU Y, et al. Scientific discovery in the age of artificial intelligence[J]. Nature, 2023, 620(7972):47-60. |
7 | LU Q, REN J, WANG Z. Using genetic programming with prior formula knowledge to solve symbolic regression problem[J]. Computational Intelligence and Neuroscience, 2016, 2016: No.1021378. |
8 | KOZA J R. Genetic programming: a paradigm for genetically breeding populations of computer programs to solve problems[M]. Cambridge: MIT Press, 1992:12-20. |
9 | VIRGOLIN M, ALDERLIESTEN T, WITTEVEEN C, et al. Improving model-based genetic programming for symbolic regression of small expressions[J]. Evolutionary Computation, 2020, 29(2): 211-237. |
10 | ZHOU J, FENG L, CAI W, et al. Multifactorial genetic programming for symbolic regression problems[J]. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2020, 50(11): 4492-4505. |
11 | AL-HELALI B, CHEN Q, XUE B, et al. Multitree genetic programming with new operators for transfer learning in symbolic regression with incomplete data[J]. IEEE Transactions on Evolutionary Computation, 2021, 25(6): 1049-1063. |
12 | VLADISLAVLEVA E J, SMITS G F, DEN HERTOG D. Order of nonlinearity as a complexity measure for models generated by symbolic regression via Pareto genetic programming[J]. IEEE Transactions on Evolutionary Computation, 2009, 13(2): 333-349. |
13 | DUBČÁKOVÁ R. Eureqa: software review[J]. Genetic Programming and Evolvable Machines, 2011, 12(2):173-178. |
14 | UDRESCU S M, TAN A, FENG J, et al. AI Feynman 2.0: Pareto-optimal symbolic regression exploiting graph modularity[C]// Proceedings of the 34th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2020: 4860-4871. |
15 | LU Q, ZHANG Y. Solving symbol regression based on Monte Carlo tree search[J]. Computer Engineering and Design, 2020, 41(8): 2158-2164. (in Chinese) |
16 | ALAA A M, VAN DER SCHAAR M. Demystifying black-box models with symbolic metamodels[C]// Proceedings of the 33rd International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2019:11304-11314. |
17 | CRANMER M, SANCHEZ-GONZALEZ A, BATTAGLIA P, et al. Discovering symbolic models from deep learning with inductive biases[C]// Proceedings of the 34th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2020:17429-17442. |
18 | PETERSEN B K, LANDAJUELA M, MUNDHENK T N, et al. Deep symbolic regression: recovering mathematical expressions from data via risk-seeking policy gradients[EB/OL]. [2024-04-05]. |
19 | MUNDHENK T N, LANDAJUELA M, GLATT R, et al. Symbolic regression via neural-guided genetic programming population seeding[C]// Proceedings of the 35th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2021: 24912-24923. |
20 | KAMIENNY P A, D'ASCOLI S, LAMPLE G, et al. End-to-end symbolic regression with Transformers[C]// Proceedings of the 36th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2022:10269-10281. |
21 | VASTL M, KULHÁNEK J, KUBALÍK J, et al. SymFormer: end-to-end symbolic regression using Transformer-based architecture[J]. IEEE Access, 2024, 12: 37840-37849. |
22 | LI W, LI W, SUN L, et al. Transformer-based model for symbolic regression via joint supervised learning[EB/OL]. [2024-08-22]. |
23 | BIGGIO L, BENDINELLI T, NEITZ A, et al. Neural symbolic regression that scales[C]// Proceedings of the 38th International Conference on Machine Learning. New York: JMLR.org, 2021: 936-945. |
24 | VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2017: 6000-6010. |
25 | ELMAN J L. Finding structure in time[J]. Cognitive Science, 1990, 14(2): 179-211. |
26 | LeCUN Y, BOTTOU L, BENGIO Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324. |
27 | CHAMBERS L G. Review: Practical methods of optimization (2nd ed.)[J].The Mathematical Gazette, 2001, 85(504): 562-563. |
28 | PETERSEN B K, SANTIAGO C P, LANDAJUELA M. Incorporating domain knowledge into neural-guided search[EB/OL]. [2024-08-02]. |
29 | WILLIAMS R J. Simple statistical gradient-following algorithms for connectionist reinforcement learning[J]. Machine Learning, 1992, 8(3/4): 229-256. |