Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (9): 2732-2738. DOI: 10.11772/j.issn.1001-9081.2023091301
• Data science and technology •
Liting LI, Bei HUA, Ruozhou HE, Kuang XU
Received: 2023-09-20
Revised: 2023-12-05
Accepted: 2023-12-11
Online: 2024-02-07
Published: 2024-09-10
Contact: Bei HUA
About author: LI Liting, born in 1999 in Ningbo, Zhejiang, M. S. candidate. His research interests include deep neural networks, data mining, and time series prediction.
Liting LI, Bei HUA, Ruozhou HE, Kuang XU. Multivariate time series prediction model based on decoupled attention mechanism [J]. Journal of Computer Applications, 2024, 44(9): 2732-2738.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2023091301
TTV:

| Model | h=24 MAE | h=24 MSE | h=48 MAE | h=48 MSE | h=96 MAE | h=96 MSE |
|---|---|---|---|---|---|---|
| LSTNet | 0.151 | 0.054 | 0.141 | 0.051 | 0.194 | 0.134 |
| MTGNN | 0.163 | 0.063 | 0.173 | 0.070 | 0.182 | 0.084 |
| Transformer | 0.194 | 0.078 | 0.155 | 0.055 | 0.173 | 0.069 |
| Informer | 0.190 | 0.077 | 0.158 | 0.058 | 0.171 | 0.067 |
| Autoformer | 0.224 | 0.094 | 0.204 | 0.076 | 0.230 | 0.098 |
| FEDformer | 0.224 | 0.089 | 0.157 | 0.050 | 0.171 | 0.059 |
| Triformer | 0.219 | 0.098 | 0.155 | 0.053 | 0.175 | 0.063 |
| Decformer | 0.116 | 0.038 | 0.116 | 0.039 | 0.141 | 0.055 |

ECL:

| Model | h=24 MAE | h=24 MSE | h=48 MAE | h=48 MSE | h=96 MAE | h=96 MSE |
|---|---|---|---|---|---|---|
| LSTNet | 0.248 | 0.177 | 0.288 | 0.208 | 0.302 | 0.220 |
| MTGNN | 0.285 | 0.188 | 0.320 | 0.224 | 0.319 | 0.226 |
| Transformer | 0.359 | 0.275 | 0.375 | 0.296 | 0.393 | 0.326 |
| Informer | 0.367 | 0.286 | 0.422 | 0.378 | 0.474 | 0.447 |
| Autoformer | 0.289 | 0.171 | 0.310 | 0.193 | 0.315 | 0.200 |
| FEDformer | 0.288 | 0.173 | 0.288 | 0.173 | 0.307 | 0.195 |
| Triformer | 0.304 | 0.213 | 0.332 | 0.249 | 0.343 | 0.268 |
| Decformer | 0.221 | 0.144 | 0.239 | 0.159 | 0.255 | 0.184 |

PeMS-Bay:

| Model | h=24 MAE | h=24 MSE | h=48 MAE | h=48 MSE | h=96 MAE | h=96 MSE |
|---|---|---|---|---|---|---|
| LSTNet | 0.331 | 0.540 | 0.353 | 0.597 | 0.382 | 0.642 |
| MTGNN | 0.501 | 0.933 | 0.626 | 1.270 | 0.560 | 1.091 |
| Transformer | 0.355 | 0.470 | 0.367 | 0.542 | 0.380 | 0.573 |
| Informer | 0.361 | 0.485 | 0.402 | 0.625 | 0.450 | 0.744 |
| Autoformer | 0.406 | 0.667 | 0.594 | 1.144 | 0.708 | 1.400 |
| FEDformer | 0.364 | 0.505 | 0.403 | 0.586 | 0.420 | 0.611 |
| Triformer | 0.337 | 0.498 | 0.415 | 0.663 | 0.548 | 0.945 |
| Decformer | 0.266 | 0.398 | 0.307 | 0.502 | 0.328 | 0.538 |
Tab. 1 Comparison of prediction performance of different models on three datasets
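For context, the MAE and MSE in Tab. 1 are the usual pointwise errors averaged over all prediction steps in the horizon and all variables; the paper's experiments are implemented in PyTorch [28]. Below is a minimal sketch of how such scores are typically computed; the tensor shapes and the assumption that predictions and targets share one normalized scale are illustrative, not taken from the paper.

```python
import torch

def forecast_errors(pred: torch.Tensor, target: torch.Tensor) -> tuple[float, float]:
    """Return (MAE, MSE) averaged over batch, horizon steps, and variables.

    pred, target: (batch, horizon, num_variables) tensors, assumed to be
    on the same (e.g. normalized) scale.
    """
    err = pred - target
    return err.abs().mean().item(), (err ** 2).mean().item()

# Example: scores for a horizon-96 forecast of 321 variables (roughly ECL-sized).
pred, target = torch.randn(32, 96, 321), torch.randn(32, 96, 321)
mae, mse = forecast_errors(pred, target)
```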
| Model variant | MAE | MSE | Average error increase/% |
|---|---|---|---|
| Decformer (full model) | 0.116 | 0.039 | — |
| Using self-attention | 0.120 | 0.041 | 4.29 |
| Without pattern-embedding pre-training | 0.118 | 0.040 | 2.14 |
| Without pattern attention module | 0.121 | 0.042 | 6.00 |
| Without temporal attention module | 0.118 | 0.041 | 3.43 |
Tab. 2 Performance comparison of Decformer and its variants
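The "average error increase/%" column in Tab. 2 is consistent with taking, for each variant, the mean of its relative MAE and MSE increases over the full Decformer; this interpretation is inferred from the numbers rather than stated here. The short plain-Python check below, with values copied from Tab. 2, reproduces all four entries.

```python
# Check of Tab. 2's "average error increase/%": for each ablated variant,
# average the relative MAE and MSE increases over the full Decformer.
BASE_MAE, BASE_MSE = 0.116, 0.039  # full Decformer

variants = {
    "using self-attention": (0.120, 0.041),
    "no pattern-embedding pre-training": (0.118, 0.040),
    "no pattern attention module": (0.121, 0.042),
    "no temporal attention module": (0.118, 0.041),
}

for name, (mae, mse) in variants.items():
    rise = ((mae - BASE_MAE) / BASE_MAE + (mse - BASE_MSE) / BASE_MSE) / 2
    print(f"{name}: {100 * rise:.2f}%")  # prints 4.29, 2.14, 6.00, 3.43
```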
1 | ASHKBOOS S, HUANG L, DRYDEN N, et al. ENS-10: a dataset for post-processing ensemble weather forecasts [C]// Proceedings of the 36th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2022: 21974-21987. |
2 | MATSUBARA Y, SAKURAI Y, VAN PANHUIS W G, et al. FUNNEL: automatic mining of spatially coevolving epidemics [C]// Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2014: 105-114. |
3 | DEB C, ZHANG F, YANG J, et al. A review on time series forecasting techniques for building energy consumption [J]. Renewable and Sustainable Energy Reviews, 2017, 74: 902-924. |
4 | HU H X, SUI H C, HU Q, et al. Runoff forecast model based on graph attention network and dual-stage attention mechanism [J]. Journal of Computer Applications, 2022, 42(5): 1607-1615. |
5 | LI Y, YU R, SHAHABI C, et al. Diffusion convolutional recurrent neural network: data-driven traffic forecasting [EB/OL]. (2018-02-22) [2022-10-02]. |
6 | XIA J, WANG Z Q, ZHU S M. Traffic flow prediction model based on time series decomposition [J]. Journal of Computer Applications, 2023, 43(4): 1129-1135. |
7 | STAVROGLOU S K, PANTELOUS A A, STANLEY H E, et al. Hidden interactions in financial markets [J]. Proceedings of the National Academy of Sciences of the United States of America, 2019, 116(22): 10646-10651. |
8 | LI X J, CUI C R, SONG G L, et al. Stock trend prediction method based on temporal hypergraph convolutional neural network [J]. Journal of Computer Applications, 2022, 42(3): 797-803. |
9 | LAI G, CHANG W C, YANG Y, et al. Modeling long- and short-term temporal patterns with deep neural networks [C]// Proceedings of the 41st International ACM SIGIR Conference on Research and Development in Information Retrieval. New York: ACM, 2018: 95-104. |
10 | VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2017: 6000-6010. |
11 | ZHOU H, ZHANG S, PENG J, et al. Informer: beyond efficient Transformer for long sequence time-series forecasting [C]// Proceedings of the 35th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2021: 11106-11115. |
12 | CIRSTEA R G, GUO C, YANG B, et al. Triformer: triangular, variable-specific attentions for long sequence multivariate time series forecasting [C]// Proceedings of the 31st International Joint Conference on Artificial Intelligence. California: IJCAI.org, 2022: 1994-2001. |
13 | ZHOU T, MA Z, WEN Q, et al. FEDformer: frequency enhanced decomposed Transformer for long-term series forecasting [C]// Proceedings of the 39th International Conference on Machine Learning. New York: JMLR.org, 2022: 27268-27286. |
14 | WU H, XU J, WANG J, et al. Autoformer: decomposition Transformers with auto-correlation for long-term series forecasting [C]// Proceedings of the 35th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2021: 22419-22430. |
15 | DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional Transformers for language understanding [C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Stroudsburg: ACL, 2019: 4171-4186. |
16 | RADFORD A, NARASIMHAN K, SALIMANS T, et al. Improving language understanding by generative pre-training [EB/OL]. (2018-06-11) [2023-02-10]. |
17 | WU Z, PAN S, LONG G, et al. Graph WaveNet for deep spatial-temporal graph modeling [C]// Proceedings of the 28th International Joint Conference on Artificial Intelligence. California: IJCAI.org, 2019: 1907-1913. |
18 | ZHENG C, FAN X, WANG C, et al. GMAN: a graph multi-attention network for traffic prediction [C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2020: 1234-1241. |
19 | BAHDANAU D, CHO K, BENGIO Y. Neural machine translation by jointly learning to align and translate [EB/OL]. (2016-05-19) [2022-11-02]. |
20 | SHIH S Y, SUN F K, LEE H Y. Temporal pattern attention for multivariate time series forecasting [J]. Machine Learning, 2019, 108(8/9): 1421-1441. |
21 | LIN H, GAO Z, XU Y, et al. Conditional local convolution for spatio-temporal meteorological forecasting [C]// Proceedings of the 36th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2022: 7470-7478. |
22 | WU Z, PAN S, LONG G, et al. Connecting the dots: multivariate time series forecasting with graph neural networks [C]// Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2020: 753-763. |
23 | HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition [C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 770-778. |
24 | BA J L, KIROS J R, HINTON G E. Layer normalization [EB/OL]. (2016-07-21) [2022-12-03]. |
25 | SAKOE H, CHIBA S. Dynamic programming algorithm optimization for spoken word recognition [J]. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1978, 26(1): 43-49. |
26 | SHUMAN D I, NARANG S K, FROSSARD P, et al. The emerging field of signal processing on graphs: extending high-dimensional data analysis to networks and other irregular domains [J]. IEEE Signal Processing Magazine, 2013, 30(3): 83-98. |
27 | GROVER A, LESKOVEC J. node2vec: scalable feature learning for networks [C]// Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2016: 855-864. |
28 | PASZKE A, GROSS S, MASSA F, et al. PyTorch: an imperative style, high-performance deep learning library [C]// Proceedings of the 33rd International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2019: 8026-8037. |
29 | KINGMA D P, BA J L. Adam: a method for stochastic optimization [EB/OL]. (2017-01-30) [2022-09-13]. |
30 | VAN DER MAATEN L, HINTON G. Visualizing data using t-SNE [J]. Journal of Machine Learning Research, 2008, 9: 2579-2605. |