《计算机应用》唯一官方网站 ›› 2024, Vol. 44 ›› Issue (10): 3158-3166.DOI: 10.11772/j.issn.1001-9081.2023101427
收稿日期:2023-10-23
修回日期:2024-01-11
接受日期:2024-01-18
发布日期:2024-10-15
出版日期:2024-10-10
通讯作者:
白宇
作者简介:孙凯明(2000—),男,河北石家庄人,硕士研究生,主要研究方向:自然语言处理
Kaiming SUN, Dongfeng CAI, Yu BAI(
)
Received:2023-10-23
Revised:2024-01-11
Accepted:2024-01-18
Online:2024-10-15
Published:2024-10-10
Contact:
Yu BAI
About author:SUN Kaiming, born in 2000, M. S. candidate. His research interests include natural language processing.摘要:
针对现有符号回归方法难以有效泛化至整数数列在线百科全书(OEIS)中数列的问题,提出一种基于自学习(SL)的整数数列符号回归方法。首先,通过程序构造多种学习数据,结合OEIS数据的特点融入高阶线性递推数据,并采用OEIS初始项生成递推数列;其次,将学习数据转换为OEIS数据,提出融合多种OEIS数据作为初始迭代数据的策略;最后,通过自学习迭代逐步发现OEIS数列的公式,迭代过程分为学习、搜索、检验、选择这4个阶段。实验结果表明,所提方法优于深度符号回归(DSR)方法和Mathematica内置函数,在Easy、Sign和Base这3个测试集上相较于DSR的准确率分别提升9.66、4.17和5.14个百分点,共发现27 433个OEIS数列的公式,其中新发现的公式可以辅助数学家研究相关理论。
中图分类号:
孙凯明, 蔡东风, 白宇. 基于自学习的整数数列符号回归方法[J]. 计算机应用, 2024, 44(10): 3158-3166.
Kaiming SUN, Dongfeng CAI, Yu BAI. Symbolic regression method for integer sequence based on self-learning[J]. Journal of Computer Applications, 2024, 44(10): 3158-3166.
| 公式类型 | 初始项 | ||||
|---|---|---|---|---|---|
| 普通递推 | 10 | 6 | 5 | 30 | [-10,10] |
| 一般高阶递推 | 10 | 12 | 13 | 30 | OEIS |
| 高阶线性递推 | 2 | 12 | 13 | 30 | OEIS |
| 高阶常系数线性递推 | 10 | 12 | 13 | 30 | OEIS |
表1 生成公式的超参数
Tab. 1 Hyperparameters for generating formulas
| 公式类型 | 初始项 | ||||
|---|---|---|---|---|---|
| 普通递推 | 10 | 6 | 5 | 30 | [-10,10] |
| 一般高阶递推 | 10 | 12 | 13 | 30 | OEIS |
| 高阶线性递推 | 2 | 12 | 13 | 30 | OEIS |
| 高阶常系数线性递推 | 10 | 12 | 13 | 30 | OEIS |
| 类型 | 运算符 | 符号 | 释义 | 定义式 |
|---|---|---|---|---|
二元 运算符 | add | + | 加法 | |
| sub | - | 减法 | ||
| mul | * | 乘法 | ||
| intdiv | // | 整数除法 | ||
| mod | % | 取余 | ||
一元 运算符 | abs | | | | 绝对值 | |
| sqr | ** | 平方 | ||
| sign | sign | 取符号 | ||
| ReLU | ReLU | 非线性运算 |
表2 公式运算符种类
Tab. 2 Types of formula operators
| 类型 | 运算符 | 符号 | 释义 | 定义式 |
|---|---|---|---|---|
二元 运算符 | add | + | 加法 | |
| sub | - | 减法 | ||
| mul | * | 乘法 | ||
| intdiv | // | 整数除法 | ||
| mod | % | 取余 | ||
一元 运算符 | abs | | | | 绝对值 | |
| sqr | ** | 平方 | ||
| sign | sign | 取符号 | ||
| ReLU | ReLU | 非线性运算 |
| 训练集 | 公式类型 | 训练集大小/106 |
|---|---|---|
| ITSA | 普通递推 | 600 |
| ITSB | 一般高阶递推 | 25 |
| ITSC | 高阶线性递推 | 10 |
| ITSD | 常系数高阶线性递推 | 10 |
表3 不同类型的训练集的大小分布
Tab. 3 Size distribution of different types of train sets
| 训练集 | 公式类型 | 训练集大小/106 |
|---|---|---|
| ITSA | 普通递推 | 600 |
| ITSB | 一般高阶递推 | 25 |
| ITSC | 高阶线性递推 | 10 |
| ITSD | 常系数高阶线性递推 | 10 |
| 阶段 | 超参数 | 配置 |
|---|---|---|
| Learn | Arch | Transformer |
| Optimizer | Adam | |
| Lr-scheduler | Inverse_sqrt | |
| Lr | 0.000 4 | |
| Clip-norm | 0.1 | |
| Dropout | 0.2 | |
| Warmup-updates | 1 000 | |
| Warmup-init-lr | 1×10-7 | |
| Batch-size | 256 | |
| Search | BeamSize | 32 |
| NBest | 32 | |
| Max-tokens | 4 096 |
表4 模型参数配置
Tab. 4 Model parameters setting
| 阶段 | 超参数 | 配置 |
|---|---|---|
| Learn | Arch | Transformer |
| Optimizer | Adam | |
| Lr-scheduler | Inverse_sqrt | |
| Lr | 0.000 4 | |
| Clip-norm | 0.1 | |
| Dropout | 0.2 | |
| Warmup-updates | 1 000 | |
| Warmup-init-lr | 1×10-7 | |
| Batch-size | 256 | |
| Search | BeamSize | 32 |
| NBest | 32 | |
| Max-tokens | 4 096 |
图5 已发现公式的OEIS数列(SeqNum)和新发现公式的OEIS数列(NewSeqNum)的变化情况
Fig. 5 Changes in number of OEIS sequence of discovered formulas (SeqNum) and number of OEIS sequence of newly discovered formulas (NewSeqNum)
| 方法 | npred | 准确率/% | ||
|---|---|---|---|---|
| Easy | Sign | Base | ||
| FSF | 1 | 14.14 | 7.97 | 1.52 |
| 10 | 13.90 | 6.98 | 1.34 | |
| all | 13.63 | 6.44 | 0.91 | |
| FLR | 1 | 21.88 | 15.15 | 11.21 |
| 10 | 20.86 | 13.67 | 7.80 | |
| all | 19.76 | 12.96 | 5.55 | |
| DSR[ | 1 | 30.20 | 23.96 | 12.54 |
| 10 | 20.42 | 12.44 | 3.96 | |
| all | 19.16 | 9.26 | 1.73 | |
| SL-A2 | 1 | 29.91 | 20.49 | 5.86 |
| 10 | 22.97 | 9.19 | 1.29 | |
| all | 22.86 | 9.00 | 1.11 | |
| SL-A1 | 1 | 29.52 | 19.95 | 5.73 |
| 10 | 24.29 | 10.44 | 1.65 | |
| all | 24.23 | 10.40 | 1.51 | |
| SL-All | 1 | 33.54 | 24.56 | 12.19 |
| 10 | 29.01 | 13.70 | 7.12 | |
| all | 28.82 | 13.43 | 6.87 | |
表5 不同方法在3个测试集上的准确率
Tab. 5 Accuracies of different methods on three test sets
| 方法 | npred | 准确率/% | ||
|---|---|---|---|---|
| Easy | Sign | Base | ||
| FSF | 1 | 14.14 | 7.97 | 1.52 |
| 10 | 13.90 | 6.98 | 1.34 | |
| all | 13.63 | 6.44 | 0.91 | |
| FLR | 1 | 21.88 | 15.15 | 11.21 |
| 10 | 20.86 | 13.67 | 7.80 | |
| all | 19.76 | 12.96 | 5.55 | |
| DSR[ | 1 | 30.20 | 23.96 | 12.54 |
| 10 | 20.42 | 12.44 | 3.96 | |
| all | 19.16 | 9.26 | 1.73 | |
| SL-A2 | 1 | 29.91 | 20.49 | 5.86 |
| 10 | 22.97 | 9.19 | 1.29 | |
| all | 22.86 | 9.00 | 1.11 | |
| SL-A1 | 1 | 29.52 | 19.95 | 5.73 |
| 10 | 24.29 | 10.44 | 1.65 | |
| all | 24.23 | 10.40 | 1.51 | |
| SL-All | 1 | 33.54 | 24.56 | 12.19 |
| 10 | 29.01 | 13.70 | 7.12 | |
| all | 28.82 | 13.43 | 6.87 | |
| Number | 方法 | Easy | Sign | Base |
|---|---|---|---|---|
| N1 | 5M+6d+FIR | 10.46 | 6.10 | 0.62 |
| N2 | 5M+12d+FIR | 10.84 | 5.68 | 0.61 |
| N3 | 5M+6d+FIO | 14.23 | 5.56 | 0.84 |
| N4 | 5M+12d+FIO | 17.61 | 6.97 | 1.34 |
| N5 | 5M+2M+12d+FIO | 18.55 | 8.65 | 1.47 |
表6 不同超参数对准确率的影响 (%)
Tab. 6 Impact of different hyperparameters on accuracy
| Number | 方法 | Easy | Sign | Base |
|---|---|---|---|---|
| N1 | 5M+6d+FIR | 10.46 | 6.10 | 0.62 |
| N2 | 5M+12d+FIR | 10.84 | 5.68 | 0.61 |
| N3 | 5M+6d+FIO | 14.23 | 5.56 | 0.84 |
| N4 | 5M+12d+FIO | 17.61 | 6.97 | 1.34 |
| N5 | 5M+2M+12d+FIO | 18.55 | 8.65 | 1.47 |
| OEIS序号 | 本文发现的公式 | 数列部分项 |
|---|---|---|
| A036146 | 1,2,4,8,16 | |
| A000037 | 2,3,5,6,7,8 | |
| A214993 | 11,121,1 341 | |
| A080206 | 1,1,1,0,2 | |
| A070602 | 0,1,14,9,16 | |
| A147657 | 1,2,1,-2,-2 | |
| A301697 | 1,5,10,16 |
表7 发现的一些有趣的OEIS数列公式
Tab. 7 Some discovered interesting OEIS sequence formulas
| OEIS序号 | 本文发现的公式 | 数列部分项 |
|---|---|---|
| A036146 | 1,2,4,8,16 | |
| A000037 | 2,3,5,6,7,8 | |
| A214993 | 11,121,1 341 | |
| A080206 | 1,1,1,0,2 | |
| A070602 | 0,1,14,9,16 | |
| A147657 | 1,2,1,-2,-2 | |
| A301697 | 1,5,10,16 |
| OEIS序号 | 本文发现的公式 | OEIS原公式 | 数列部分项 |
|---|---|---|---|
| A212342 | 1, 1, 2, 5, 9, 14, 20, 27, 35 | ||
| A006257 | 0, 1, 1, 3, 1, 3, 5, 7, 1, 3 | ||
| A350520 | 1, 1, 3, 8, 14, 32, 60, 128 | ||
| A099197 | 0, 1, 20, 201, 1 360, 7 001 | ||
| A273331 | 1, 5, 17, 69, 281, 1 129 | ||
| A264041 | 1, 3, 6, 10, 16, 21, 29, 36 | ||
| A268866 | 2, 3, 22, 38, 342, 598 |
表8 发现与OEIS中不同的数列公式
Tab. 8 Discovered different sequence formulas from those in OEIS
| OEIS序号 | 本文发现的公式 | OEIS原公式 | 数列部分项 |
|---|---|---|---|
| A212342 | 1, 1, 2, 5, 9, 14, 20, 27, 35 | ||
| A006257 | 0, 1, 1, 3, 1, 3, 5, 7, 1, 3 | ||
| A350520 | 1, 1, 3, 8, 14, 32, 60, 128 | ||
| A099197 | 0, 1, 20, 201, 1 360, 7 001 | ||
| A273331 | 1, 5, 17, 69, 281, 1 129 | ||
| A264041 | 1, 3, 6, 10, 16, 21, 29, 36 | ||
| A268866 | 2, 3, 22, 38, 342, 598 |
| OEIS序号 | 数列描述 | 本文发现的新公式 | 数列部分项 |
|---|---|---|---|
| A134342 | 某自适应自动机接受的输入组成的数列 | 0,2,5,9,15,24,38,59,90 | |
| A352178 | 集合S中含有任意n个不同的整数, | 0,1,3,4,6,7,9,11,13,15 | |
| A302930 | 在有n个地雷的无限扫雷网格中,可放入“6”的最大数, 要求每个“6”都正好与6个地雷相邻 | 0,0,0,0,0,1,1,2,2,3,3 | |
| A213685 | 最小尺寸的最大反链 | 1,3,6,9,12,17,22,28,33 | |
| A229803 | 三角形六边形单元格棋盘上的车图HR(n)的支配数, 车可以沿着相邻单元格的任何一行移动 | 1,1,2,2,3,3,3,4,4,5,5 | |
| A157795 | 与有关密度Hales-Jewett定理的超乐观猜想有关 | 1,2,4,6,9,12,15,18,22 |
表9 新发现的OEIS数列公式
Tab. 9 Newly discovered OEIS sequence formulas
| OEIS序号 | 数列描述 | 本文发现的新公式 | 数列部分项 |
|---|---|---|---|
| A134342 | 某自适应自动机接受的输入组成的数列 | 0,2,5,9,15,24,38,59,90 | |
| A352178 | 集合S中含有任意n个不同的整数, | 0,1,3,4,6,7,9,11,13,15 | |
| A302930 | 在有n个地雷的无限扫雷网格中,可放入“6”的最大数, 要求每个“6”都正好与6个地雷相邻 | 0,0,0,0,0,1,1,2,2,3,3 | |
| A213685 | 最小尺寸的最大反链 | 1,3,6,9,12,17,22,28,33 | |
| A229803 | 三角形六边形单元格棋盘上的车图HR(n)的支配数, 车可以沿着相邻单元格的任何一行移动 | 1,1,2,2,3,3,3,4,4,5,5 | |
| A157795 | 与有关密度Hales-Jewett定理的超乐观猜想有关 | 1,2,4,6,9,12,15,18,22 |
| 1 | POLU S, SUTSKEVER I. Generative language modeling for automated theorem proving [EB/OL]. (2020-09-08) [2023-10-18]. . |
| 2 | YANG K, SWOPE A M, GU A, et al. LeanDojo: theorem proving with retrieval-augmented language models [EB/OL]. (2023-06-27) [2023-10-18]. . |
| 3 | RAAYONI G, GOTTLIEB S, MANOR Y, et al. Generating conjectures on fundamental constants with the Ramanujan machine [J]. Nature, 2021, 590(7844): 67-73. |
| 4 | DAVIES A, VELIČKOVIĆ P, BUESING L, et al. Advancing mathematics by guiding human intuition with AI [J]. Nature, 2021, 600(7887): 70-74. |
| 5 | GAUTHIER T, URBAN J. Learning program synthesis for integer sequences from scratch [C]// Proceedings of the 37th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2023: 7670-7677. |
| 6 | SLOANE N J A. The on-line encyclopedia of integer sequences [DB/OL]. [2023-04-15]. . |
| 7 | D’ASCOLI S, P-A KAMIENNY, LAMPLE G, et al. Deep symbolic regression for recurrence prediction [C]// Proceedings of the 39th International Conference on Machine Learning. New York: ML Research Press, 2022: 4520-4536. |
| 8 | VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need [C]// Proceedings of the 31st Conference on Neural Information Processing Systems. Red Hook: Curran Associates, 2017: 6000-6010. |
| 9 | SAXTON D, GREFENSTETTE E, HILL F, et al. Analysing mathematical reasoning abilities of neural models [EB/OL]. (2019-04-02) [2023-10-18]. . |
| 10 | COBBE K, KOSARAJU V, BAVARIAN M, et al. Training verifiers to solve math word problems [EB/OL]. (2021-10-27)[2023-10-18]. . |
| 11 | LAMPLE G, CHARTON F. Deep learning for symbolic mathematics [EB/OL]. (2019-12-02) [2023-10-18]. |
| 12 | ALLAMANIS M, CHANTHIRASEGARAN P, KOHLI P, et al. Learning continuous semantic representations of symbolic expressions [C]// Proceedings of the 34th International Conference on Machine Learning. New York: JMLR.org, 2017: 80-88. |
| 13 | SCHNEIDEREIT T, BREUß M. Polynomial neural forms using feedforward neural networks for solving differential equations [C]// Proceedings of the 20th International Conference on Artificial Intelligence and Soft Computing. Cham: Springer, 2021: 236-245. |
| 14 | CHARTON F, HAYAT A, LAMPLE G. Learning advanced mathematical computations from examples [EB/OL]. (2020-06-11) [2023-10-18]. . |
| 15 | KAISER L, SUTSKEVER I. Neural GPUs learn algorithms [EB/OL]. (2015-11-25) [2023-10-18]. . |
| 16 | TRASK A, HILL F, REED S, et al. Neural arithmetic logic units [EB/OL]. (2018-08-01) [2024-01-09]. . |
| 17 | CHARTON F. Linear algebra with transformers [EB/OL]. (2021-12-03) [2023-10-18]. . |
| 18 | KOZA J R. Genetic programming as a means for programming computers by natural selection [J]. Statistics and Computing, 1994, 4: 87-112. |
| 19 | PETERSEN B K, LANDAJUELA M, MUNDHENK T N, et al. Deep symbolic regression: recovering mathematical expressions from data via risk-seeking policy gradients [EB/OL]. (2019-12-10) [2023-10-18]. . |
| 20 | MARTIUS G, LAMPERT C H. Extrapolation and learning equations [EB/OL]. (2016-10-10) [2023-10-18]. . |
| 21 | BIGGIO L, BENDINELLI T, NEITZ A, et al. Neural symbolic regression that scales [C]// Proceedings of the 38th International Conference on Machine Learning. New York: ML Research Press, 2021: 936-945. |
| 22 | RAGNI M, KLEIN A. Predicting numbers: an AI approach to solving number series [C]// Proceedings of the 34th Annual German Conference on Advances in Artificial Intelligence. Berlin: Springer, 2011: 255-259. |
| 23 | 蔡东风,朱耀辉,白宇.一种面向计数问题的公式发现方法[J].沈阳航空航天大学学报,2016,33(5):61-67. |
| CAI D F, ZHU Y H, BAI Y. Formula discovery method for counting problem [J]. Journal of Shenyang Aerospace University, 2016,33(5):61-67. | |
| 24 | 杨亚琴,蔡东风,白宇.基于多策略的整数数列学习[J].沈阳航空航天大学学报, 2018,35(2):52-59. |
| YANG Y Q, CAI D F, BAI Y. Multi-strategy based integer sequence learning [J]. Journal of Shenyang Aerospace University, 2018,35(2):52-59. | |
| 25 | NAM H, KIM S, JUNG K. Number sequence prediction problems for evaluating computational powers of neural networks [C]// Proceedings of the 33rd AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2019: 4626-4633. |
| 26 | WU C W. Can machine learning identify interesting mathematics? An exploration using empirically observed laws [EB/OL]. (2018-05-18)[2023-10-18]. . |
| 27 | GAUTHIER T, OLŠÁK M, URBAN J. Alien coding [EB/OL]. (2023-01-27) [2023-10-18]. . |
| 28 | M-R AMINI, FEOFANOV V, PAULETTO L, et al. Self-training: a survey [EB/OL]. (2022-02-24) [2023-10-18]. . |
| 29 | ANDRYCHOWICZ M, WOLSKI F, RAY A, et al. Hindsight experience replay [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates, 2017: 5055-5065. |
| 30 | OTT M, EDUNOV S, BAEVSKI A, et al. fairseq: a fast, extensible Toolkit for sequence modeling [EB/OL]. (2019-04-01) [2023-10-18]. . |
| 31 | QIAO S, SHEN W, ZHANG Z, et al. Deep co-training for semi-supervised image recognition [C]// Proceedings of the 15th European Conference on Computer Vision. Cham: Springer, 2018: 142-159. |
| [1] | 方介泼, 陶重犇. 应对零日攻击的混合车联网入侵检测系统[J]. 《计算机应用》唯一官方网站, 2024, 44(9): 2763-2769. |
| [2] | 贾洁茹, 杨建超, 张硕蕊, 闫涛, 陈斌. 基于自蒸馏视觉Transformer的无监督行人重识别[J]. 《计算机应用》唯一官方网站, 2024, 44(9): 2893-2902. |
| [3] | 任烈弘, 黄铝文, 田旭, 段飞. 基于DFT的频率敏感双分支Transformer多变量长时间序列预测方法[J]. 《计算机应用》唯一官方网站, 2024, 44(9): 2739-2746. |
| [4] | 黄云川, 江永全, 黄骏涛, 杨燕. 基于元图同构网络的分子毒性预测[J]. 《计算机应用》唯一官方网站, 2024, 44(9): 2964-2969. |
| [5] | 杨鑫, 陈雪妮, 吴春江, 周世杰. 结合变种残差模型和Transformer的城市公路短时交通流预测[J]. 《计算机应用》唯一官方网站, 2024, 44(9): 2947-2951. |
| [6] | 李金金, 桑国明, 张益嘉. APK-CNN和Transformer增强的多域虚假新闻检测模型[J]. 《计算机应用》唯一官方网站, 2024, 44(9): 2674-2682. |
| [7] | 丁宇伟, 石洪波, 李杰, 梁敏. 基于局部和全局特征解耦的图像去噪网络[J]. 《计算机应用》唯一官方网站, 2024, 44(8): 2571-2579. |
| [8] | 邓凯丽, 魏伟波, 潘振宽. 改进掩码自编码器的工业缺陷检测方法[J]. 《计算机应用》唯一官方网站, 2024, 44(8): 2595-2603. |
| [9] | 杨帆, 邹窈, 朱明志, 马振伟, 程大伟, 蒋昌俊. 基于图注意力Transformer神经网络的信用卡欺诈检测模型[J]. 《计算机应用》唯一官方网站, 2024, 44(8): 2634-2642. |
| [10] | 李大海, 王忠华, 王振东. 结合空间域和频域信息的双分支低光照图像增强网络[J]. 《计算机应用》唯一官方网站, 2024, 44(7): 2175-2182. |
| [11] | 吕锡婷, 赵敬华, 荣海迎, 赵嘉乐. 基于Transformer和关系图卷积网络的信息传播预测模型[J]. 《计算机应用》唯一官方网站, 2024, 44(6): 1760-1766. |
| [12] | 黎施彬, 龚俊, 汤圣君. 基于Graph Transformer的半监督异配图表示学习模型[J]. 《计算机应用》唯一官方网站, 2024, 44(6): 1816-1823. |
| [13] | 黄梦源, 常侃, 凌铭阳, 韦新杰, 覃团发. 基于层间引导的低光照图像渐进增强算法[J]. 《计算机应用》唯一官方网站, 2024, 44(6): 1911-1919. |
| [14] | 孙子文, 钱立志, 杨传栋, 高一博, 陆庆阳, 袁广林. 基于Transformer的视觉目标跟踪方法综述[J]. 《计算机应用》唯一官方网站, 2024, 44(5): 1644-1654. |
| [15] | 席治远, 唐超, 童安炀, 王文剑. 基于双路时空网络的驾驶员行为识别[J]. 《计算机应用》唯一官方网站, 2024, 44(5): 1511-1519. |
| 阅读次数 | ||||||
|
全文 |
|
|||||
|
摘要 |
|
|||||