Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (7): 2262-2268.DOI: 10.11772/j.issn.1001-9081.2024070929
• Data science and technology •
Huibin WANG1, Zhan’ao HU2, Jie HU2, Yuanwei XU1, Bo WEN1
Received: 2024-07-03
Revised: 2024-10-18
Accepted: 2024-10-22
Online: 2025-07-10
Published: 2025-07-10
Contact: Huibin WANG (whbzhu@foxmail.com)
About author: WANG Huibin, born in 1994 in Meishan, Sichuan, M. S. His research interests include early warning analysis of electrical equipment and time series forecasting.
Huibin WANG, Zhan’ao HU, Jie HU, Yuanwei XU, Bo WEN. Time series forecasting model based on segmented attention mechanism[J]. Journal of Computer Applications, 2025, 45(7): 2262-2268.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2024070929
| Dataset | Samples | Features | Sampling interval |
| --- | --- | --- | --- |
| ETTh1 | 17 420 | 7 | 1 h |
| ETTh2 | 17 420 | 7 | 1 h |
| ETTm1 | 69 680 | 7 | 15 min |
| ILI | 966 | 7 | 1 week |
| WTH | 35 065 | 12 | 1 h |
Tab. 1 Statistical information of datasets
| Dataset | Forecast length | Proposed model (MSE / MAE) | Crossformer[11] (MSE / MAE) | Pyraformer[16] (MSE / MAE) | Informer[10] (MSE / MAE) | Reformer[21] (MSE / MAE) |
| --- | --- | --- | --- | --- | --- | --- |
| ETTh1 | 24 | 0.303 / 0.364 | 0.311 / 0.368 | 0.493 / 0.507 | 0.577 / 0.549 | 0.991 / 0.754 |
| ETTh1 | 48 | 0.370 / 0.407 | 0.371 / 0.413 | 0.554 / 0.544 | 0.685 / 0.625 | 1.313 / 0.906 |
| ETTh1 | 168 | 0.570 / 0.535 | 0.576 / 0.541 | 0.781 / 0.675 | 0.931 / 0.752 | 1.824 / 1.138 |
| ETTh1 | 336 | 0.719 / 0.614 | 0.722 / 0.628 | 0.912 / 0.747 | 1.128 / 0.873 | 2.117 / 1.280 |
| ETTh1 | 720 | 0.975 / 0.768 | 1.015 / 0.791 | 0.993 / 0.792 | 1.215 / 0.896 | 2.415 / 1.520 |
| ETTh2 | 96 | 0.518 / 0.505 | 0.745 / 0.584 | 0.645 / 0.597 | 3.755 / 1.525 | 2.626 / 1.317 |
| ETTh2 | 192 | 0.830 / 0.620 | 0.877 / 0.656 | 0.788 / 0.683 | 5.602 / 1.931 | 11.120 / 2.979 |
| ETTh2 | 336 | 0.826 / 0.645 | 1.043 / 0.731 | 0.907 / 0.747 | 4.721 / 1.835 | 9.323 / 2.769 |
| ETTh2 | 720 | 0.941 / 0.689 | 1.104 / 0.763 | 0.963 / 0.783 | 3.647 / 1.625 | 3.874 / 1.697 |
| ETTm1 | 24 | 0.232 / 0.299 | 0.234 / 0.302 | 0.310 / 0.371 | 0.323 / 0.369 | 0.724 / 0.607 |
| ETTm1 | 48 | 0.295 / 0.357 | 0.330 / 0.377 | 0.465 / 0.464 | 0.494 / 0.503 | 1.098 / 0.777 |
| ETTm1 | 96 | 0.365 / 0.411 | 0.412 / 0.436 | 0.520 / 0.504 | 0.678 / 0.614 | 1.433 / 0.945 |
| ETTm1 | 288 | 0.446 / 0.468 | 0.517 / 0.517 | 0.729 / 0.657 | 1.056 / 0.786 | 1.820 / 1.094 |
| ETTm1 | 672 | 0.658 / 0.623 | 0.755 / 0.672 | 0.980 / 0.678 | 1.192 / 0.926 | 2.187 / 1.232 |
| ILI | 24 | 3.391 / 1.256 | 3.396 / 1.190 | 3.970 / 1.338 | 4.588 / 1.462 | 4.400 / 1.382 |
| ILI | 36 | 3.499 / 1.228 | 3.495 / 1.232 | 4.337 / 1.410 | 4.845 / 1.496 | 4.783 / 1.448 |
| ILI | 48 | 3.680 / 1.261 | 3.716 / 1.250 | 4.811 / 1.503 | 4.865 / 1.516 | 4.832 / 1.465 |
| ILI | 60 | 3.885 / 1.299 | 3.961 / 1.305 | 5.204 / 1.588 | 5.212 / 1.576 | 4.882 / 1.483 |
| WTH | 24 | 0.293 / 0.351 | 0.294 / 0.343 | 0.301 / 0.359 | 0.335 / 0.381 | 0.655 / 0.583 |
| WTH | 48 | 0.365 / 0.413 | 0.370 / 0.411 | 0.376 / 0.421 | 0.395 / 0.459 | 0.729 / 0.666 |
| WTH | 168 | 0.506 / 0.512 | 0.507 / 0.515 | 0.519 / 0.521 | 0.608 / 0.567 | 1.318 / 0.855 |
| WTH | 336 | 0.534 / 0.535 | 0.536 / 0.528 | 0.539 / 0.543 | 0.702 / 0.620 | 1.930 / 1.167 |
| WTH | 720 | 0.576 / 0.571 | 0.590 / 0.569 | 0.547 / 0.553 | 0.831 / 0.731 | 2.726 / 1.575 |
| Average | | 1.140 / 0.646 | 1.164 / 0.652 | 1.439 / 0.746 | 1.614 / 0.826 | 3.005 / 1.289 |
Tab. 2 Forecasting results of multivariate time-series with different forecasting lengths
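The MSE and MAE columns in Tab. 2 are the standard point-forecast error metrics, averaged over all forecast points. As a minimal sketch (the function and variable names are illustrative, not from the paper), they can be computed as:

```python
def mse(y_true, y_pred):
    # Mean squared error averaged over all forecast points
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    # Mean absolute error averaged over all forecast points
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy check: a forecast that is off by 1 at one of four points
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]
print(mse(y_true, y_pred))  # 0.25
print(mae(y_true, y_pred))  # 0.25
```

In the multivariate setting the average runs over every horizon step and every variable; lower values are better for both metrics.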
| Model | Average MSE | MSE decline/% | Average MAE | MAE decline/% |
| --- | --- | --- | --- | --- |
| Average | 1.805 | 36.8 | 0.878 | 26.4 |
| Crossformer[11] | 1.164 | 2.0 | 0.652 | 0.9 |
| Pyraformer[16] | 1.439 | 20.7 | 0.746 | 13.4 |
| Informer[10] | 1.614 | 29.3 | 0.826 | 21.7 |
| Reformer[21] | 3.005 | 62.0 | 1.289 | 49.8 |
Tab. 3 Decline rates of error metrics of the proposed model compared with other models
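The decline rates in Tab. 3 follow the usual relative-reduction formula, (baseline − proposed) / baseline × 100%. A minimal sketch using the average MSE values from Tab. 2 (names are illustrative; small differences from the table are expected because the published averages are rounded to three decimals):

```python
def decline_pct(baseline, proposed):
    # Relative reduction of the proposed model's error vs. a baseline, in percent
    return (baseline - proposed) / baseline * 100.0

proposed_avg_mse = 1.140  # average MSE of the proposed model (Tab. 2)
baseline_avg_mse = {"Crossformer": 1.164, "Pyraformer": 1.439,
                    "Informer": 1.614, "Reformer": 3.005}
for name, avg in baseline_avg_mse.items():
    print(f"{name}: {decline_pct(avg, proposed_avg_mse):.1f}%")
```

Running this reproduces the MSE column of Tab. 3 to within rounding (e.g. roughly 2% against Crossformer and about 62% against Reformer).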
| Forecast length | batch_size=64 (MSE / MAE) | batch_size=32 (MSE / MAE) | batch_size=16 (MSE / MAE) |
| --- | --- | --- | --- |
| 24 | 0.293 / 0.351 | 0.336 / 0.399 | 0.336 / 0.407 |
| 48 | 0.365 / 0.413 | 0.410 / 0.466 | 0.416 / 0.464 |
| 168 | 0.506 / 0.512 | 0.597 / 0.606 | 0.580 / 0.578 |
| 336 | 0.534 / 0.535 | 0.605 / 0.573 | 0.640 / 0.597 |
| 720 | 0.576 / 0.571 | 0.668 / 0.609 | 0.637 / 0.588 |
Tab. 4 Influence of hyperparameter batch_size on experimental results of proposed model on WTH dataset
| Model | MSE | MAE |
| --- | --- | --- |
| SAMformer | 3.021 | 1.192 |
| w/o static covariate embedding | 3.446 | 1.211 |
| w/o successive linear and activation layers | 3.483 | 1.217 |
| w/o intra-segment dot-product attention in time domain | 3.424 | 1.196 |
Tab. 5 Ablation experiment results on ILI dataset
| Model | MSE | MAE |
| --- | --- | --- |
| SAMformer | 0.303 | 0.364 |
| w/o static covariate embedding | 0.356 | 0.404 |
| w/o successive linear and activation layers | 0.307 | 0.366 |
| w/o intra-segment dot-product attention in time domain | 0.352 | 0.404 |
Tab. 6 Ablation experiment results on ETTh1 dataset
[1] | SRIRAMALAKSHMI P, SUBHASREE V, VONDIVILLU S T, et al. Time series analysis and forecasting of wind turbine data [C]// Proceedings of the 2022 International Virtual Conference on Power Engineering Computing and Control: Developments in Electric Vehicles and Energy Sector for Sustainable Future. Piscataway: IEEE, 2022: 1-9. |
[2] | HERNANDEZ-MATAMOROS A, FUJITA H, HAYASHI T, et al. Forecasting of COVID19 per regions using ARIMA models and polynomial functions [J]. Applied Soft Computing, 2020, 96: No.106610. |
[3] | DASH A, YE J, WANG G. A review of Generative Adversarial Networks (GANs) and its applications in a wide variety of disciplines: from medical to remote sensing [J]. IEEE Access, 2024, 12: 18330-18357. |
[4] | KAMALOV F, RAJAB K, CHERUKURI A K, et al. Deep learning for Covid-19 forecasting: state-of-the-art review [J]. Neurocomputing, 2022, 511: 142-154. |
[5] | LIU Y L, GU Y L. Prediction of temporal and spatial characteristics of freeway traffic flow based on CNN-BiLSTM [J]. Technology and Economy in Areas of Communications, 2022, 24(1): 9-18. (in Chinese) |
[6] | GASPARIN A, LUKOVIC S, ALIPPI C. Deep learning for time series forecasting: the electric load case [J]. CAAI Transactions on Intelligence Technology, 2022, 7(1): 1-25. |
[7] | VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates, 2017: 6000-6010. |
[8] | LIM B, ARIK S Ö, LOEFF N, et al. Temporal Fusion Transformers for interpretable multi-horizon time series forecasting [J]. International Journal of Forecasting, 2021, 37(4): 1748-1764. |
[9] | LI S, JIN X, XUAN Y, et al. Enhancing the locality and breaking the memory bottleneck of transformer on time series forecasting [C]// Proceedings of the 33rd International Conference on Neural Information Processing Systems. Red Hook: Curran Associates, 2019: 5243-5253. |
[10] | ZHOU H, ZHANG S, PENG J, et al. Informer: beyond efficient transformer for long sequence time-series forecasting [C]// Proceedings of the 35th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2021: 11106-11115. |
[11] | ZHANG Y, YAN J. Crossformer: Transformer utilizing cross-dimension dependency for multivariate time series forecasting [EB/OL]. [2024-05-24]. |
[12] | ZENG A, CHEN M, ZHANG L, et al. Are transformers effective for time series forecasting? [C]// Proceedings of the 37th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2023: 11121-11128. |
[13] | WU S, XIAO X, DING Q, et al. Adversarial sparse transformer for time series forecasting [C]// Proceedings of the 34th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates, 2020: 17105-17115. |
[14] | WU H, XU J, WANG J, et al. Autoformer: decomposition transformers with auto-correlation for long-term series forecasting [C]// Proceedings of the 35th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates, 2021: 22419-22430. |
[15] | ZHOU T, MA Z, WEN Q, et al. FEDformer: frequency enhanced decomposed transformer for long-term series forecasting [C]// Proceedings of the 39th International Conference on Machine Learning. New York: JMLR.org, 2022: 27268-27286. |
[16] | LIU S, YU H, LIAO C, et al. Pyraformer: low-complexity pyramidal attention for long-range time series modeling and forecasting [EB/OL]. [2024-05-24]. |
[17] | CIRSTEA R G, GUO C, YANG B, et al. Triformer: triangular, variable-specific attentions for long sequence multivariate time series forecasting [C]// Proceedings of the 31st International Joint Conference on Artificial Intelligence. California: ijcai.org, 2022: 1994-2001. |
[18] | HYNDMAN R, KOEHLER A B, ORD J K, et al. Forecasting with exponential smoothing: the state space approach, SSS [M]. Berlin: Springer, 2008: 229-251. |
[19] | DUDEK G, PEŁKA P, SMYL S. A hybrid residual dilated LSTM and exponential smoothing model for midterm electric load forecasting [J]. IEEE Transactions on Neural Networks and Learning Systems, 2022, 33(7): 2879-2891. |
[20] | PASZKE A, GROSS S, MASSA F, et al. PyTorch: an imperative style, high-performance deep learning library [C]// Proceedings of the 33rd International Conference on Neural Information Processing Systems. Red Hook: Curran Associates, 2019: 8026-8037. |
[21] | KITAEV N, KAISER Ł, LEVSKAYA A. Reformer: the efficient Transformer [EB/OL]. [2024-05-24]. |