Journal of Computer Applications, 2025, Vol. 45, Issue (8): 2582-2591. DOI: 10.11772/j.issn.1001-9081.2024071046
• Data science and technology •
Xingjie FENG, Xingpeng BIAN, Xiaorong FENG, Xinglong WANG
Received: 2024-07-26
Revised: 2024-09-29
Accepted: 2024-10-11
Online: 2024-11-19
Published: 2025-08-10
Contact: Xiaorong FENG
About author: FENG Xingjie, born in 1969 in Xingtai, Hebei, Ph.D., professor. His research interests include data warehouses and intelligent information processing.
Xingjie FENG, Xingpeng BIAN, Xiaorong FENG, Xinglong WANG. Incremental missing value imputation algorithm for time series based on diffusion model[J]. Journal of Computer Applications, 2025, 45(8): 2582-2591.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2024071046
Algorithm | State of the true value at run time | State of the model-generated value at run time |
---|---|---|
Prediction algorithm | True value does not exist yet | Predicted value cannot be evaluated |
Imputation algorithm | True value exists but is unobservable | Imputed value can be evaluated indirectly |
Tab. 1 Difference between imputation algorithms and prediction algorithms
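One common way to evaluate imputed values indirectly is to hide a share of the entries that were actually observed, impute them, and score only those hidden positions. The sketch below illustrates that idea with NumPy. The function names, the 10% holdout ratio, and the column-median imputer are illustrative assumptions, not the procedure or model used in the paper.

```python
import numpy as np


def evaluate_imputation(data, impute_fn, holdout_ratio=0.1, seed=0):
    """Hide part of the observed entries, impute them, and score the hidden ones."""
    rng = np.random.default_rng(seed)
    observed = ~np.isnan(data)  # entries whose true value is known
    holdout = observed & (rng.random(data.shape) < holdout_ratio)
    corrupted = data.copy()
    corrupted[holdout] = np.nan  # the true values exist but are now hidden
    imputed = impute_fn(corrupted)
    err = imputed[holdout] - data[holdout]
    mae = float(np.mean(np.abs(err)))
    rmse = float(np.sqrt(np.mean(err ** 2)))
    return mae, rmse


def median_impute(x):
    """Fill each column's NaNs with that column's median (stand-in imputer)."""
    filled = x.copy()
    med = np.nanmedian(filled, axis=0)
    rows, cols = np.where(np.isnan(filled))
    filled[rows, cols] = med[cols]
    return filled


# Synthetic series (assumed data): 200 time steps, 3 variables, about 5% missing.
rng = np.random.default_rng(1)
series = rng.normal(size=(200, 3))
series[rng.random(series.shape) < 0.05] = np.nan
mae, rmse = evaluate_imputation(series, median_impute, holdout_ratio=0.1)
print(f"MAE={mae:.3f}, RMSE={rmse:.3f}")
```

Any imputer that takes an array with NaNs and returns a filled array of the same shape can be passed as `impute_fn`.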
Dataset | Number of sampling points | Dimensions | Original missing rate/% |
---|---|---|---|
AQI | 8 760 | 36 | 13.3 |
ETT | 17 421 | 6 | 0.0 |
Weather | 52 697 | 21 | 0.0 |
Tab. 2 Statistical results of datasets
Dataset | batch_size | epoch | loss | learning_rate | diff_steps | res_channels | n_samples |
---|---|---|---|---|---|---|---|
AQI | 16 | 100 | huber | 0.0010 | 50 | 64 | 500 |
ETT | 16 | 100 | huber | 0.0005 | 50 | 64 | 300 |
Weather | 16 | 100 | huber | 0.0005 | 50 | 64 | 500 |
Tab. 3 I2TDM hyperparameter setting
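For readers who want to reuse these settings, the values in Tab. 3 can be written as a plain configuration mapping. The dictionary layout below is an assumption, and so is the Huber threshold `delta=1.0`: Tab. 3 names the loss but gives no threshold. The `huber_loss` function is the standard textbook definition, included only to make the `loss` column concrete.

```python
import numpy as np

# Values copied from Tab. 3; the dictionary structure itself is an assumption.
I2TDM_HPARAMS = {
    "AQI": {"batch_size": 16, "epoch": 100, "loss": "huber",
            "learning_rate": 1e-3, "diff_steps": 50,
            "res_channels": 64, "n_samples": 500},
    "ETT": {"batch_size": 16, "epoch": 100, "loss": "huber",
            "learning_rate": 5e-4, "diff_steps": 50,
            "res_channels": 64, "n_samples": 300},
    "Weather": {"batch_size": 16, "epoch": 100, "loss": "huber",
                "learning_rate": 5e-4, "diff_steps": 50,
                "res_channels": 64, "n_samples": 500},
}


def huber_loss(pred, target, delta=1.0):
    """Mean Huber loss; delta=1.0 is an assumed threshold, not given in Tab. 3."""
    err = np.abs(np.asarray(pred, dtype=float) - np.asarray(target, dtype=float))
    quadratic = 0.5 * err ** 2            # used where |error| <= delta
    linear = delta * (err - 0.5 * delta)  # used where |error| > delta
    return float(np.mean(np.where(err <= delta, quadratic, linear)))


print(I2TDM_HPARAMS["ETT"]["learning_rate"])
print(huber_loss([0.5, 3.0], [0.0, 0.0]))
```

The Huber loss is quadratic for small errors and linear for large ones, so it is less sensitive to outliers than a squared-error loss.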
Imputed missing-value ratio/% | Metric | Median | BRITS | GAIN | SAITS | CSDI | SSSD | PriSTI | I2TDM |
---|---|---|---|---|---|---|---|---|---|
10 | MAE | 65.83 | 12.47 | 26.99 | 10.29 | 7.13 | 11.96 | 7.90 | 6.83 |
10 | RMSE | 93.47 | 21.29 | 57.17 | 18.81 | 12.71 | 20.22 | 14.95 | 12.12 |
20 | MAE | 66.12 | 13.52 | 26.90 | 10.60 | 7.39 | 12.70 | 8.74 | 7.12 |
20 | RMSE | 92.07 | 22.63 | 57.07 | 19.31 | 13.22 | 21.51 | 16.90 | 12.53 |
50 | MAE | 67.13 | 18.86 | 27.50 | 12.34 | 8.64 | 14.73 | 12.04 | 8.40 |
50 | RMSE | 106.02 | 31.69 | 57.32 | 22.37 | 15.68 | 25.35 | 23.95 | 15.01 |
90 | MAE | 80.22 | 41.75 | 32.83 | 17.06 | 14.27 | 28.47 | 33.27 | 14.12 |
90 | RMSE | 124.21 | 62.27 | 60.91 | 30.33 | 24.70 | 46.34 | 59.44 | 24.74 |
Tab. 4 Experimental results of missing value imputation on AQI dataset
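To read the size of the gaps in Tab. 4, it can help to express them as relative error reductions. The sketch below does this for I2TDM against CSDI, which has the lowest errors among the baselines in Tab. 4. The numbers are copied from the table; the helper name `relative_reduction` and the choice of CSDI as the reference are illustrative assumptions.

```python
# (CSDI, I2TDM) error pairs on AQI, copied from Tab. 4.
AQI_RESULTS = {
    10: {"MAE": (7.13, 6.83), "RMSE": (12.71, 12.12)},
    20: {"MAE": (7.39, 7.12), "RMSE": (13.22, 12.53)},
    50: {"MAE": (8.64, 8.40), "RMSE": (15.68, 15.01)},
    90: {"MAE": (14.27, 14.12), "RMSE": (24.70, 24.74)},
}


def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` lowers the error relative to `baseline`."""
    return 100.0 * (baseline - proposed) / baseline


reductions = {
    (ratio, metric): relative_reduction(*pair)
    for ratio, metrics in AQI_RESULTS.items()
    for metric, pair in metrics.items()
}

for (ratio, metric), value in reductions.items():
    print(f"ratio={ratio}% {metric}: {value:+.2f}%")
```

With these values the reduction is positive in every cell except RMSE at the 90% ratio, where CSDI (24.70) is marginally lower than I2TDM (24.74).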
Imputed missing-value ratio/% | Metric | Median | BRITS | GAIN | SAITS | CSDI | SSSD | PriSTI | I2TDM |
---|---|---|---|---|---|---|---|---|---|
10 | MAE | 2.53 | 0.47 | 1.09 | 0.35 | 0.25 | 0.51 | 0.47 | 0.23 |
10 | RMSE | 4.63 | 1.05 | 2.80 | 0.97 | 0.48 | 1.17 | 0.91 | 0.43 |
20 | MAE | 2.57 | 0.54 | 1.13 | 0.38 | 0.29 | 0.54 | 0.52 | 0.27 |
20 | RMSE | 4.52 | 1.18 | 2.83 | 1.00 | 0.61 | 1.17 | 1.02 | 0.52 |
50 | MAE | 2.94 | 0.83 | 1.45 | 0.52 | 0.44 | 0.81 | 0.63 | 0.41 |
50 | RMSE | 4.71 | 1.66 | 3.22 | 1.18 | 1.00 | 1.89 | 1.26 | 0.90 |
90 | MAE | 3.72 | 2.29 | 3.21 | 1.17 | 1.07 | 1.83 | 1.51 | 1.14 |
90 | RMSE | 5.43 | 4.21 | 5.74 | 2.49 | 2.33 | 3.51 | 3.26 | 2.38 |
Tab. 5 Experimental results of missing value imputation on ETT-h1 dataset
Imputed missing-value ratio/% | Metric | Median | BRITS | GAIN | SAITS | CSDI | SSSD | PriSTI | I2TDM |
---|---|---|---|---|---|---|---|---|---|
10 | MAE | 66.23 | 6.61 | 20.37 | 3.96 | 3.02 | 6.67 | 5.68 | 2.87 |
10 | RMSE | 188.88 | 35.81 | 100.09 | 28.97 | 21.21 | 39.36 | 37.74 | 19.40 |
20 | MAE | 77.34 | 9.12 | 19.79 | 4.31 | 3.39 | 7.96 | 5.77 | 3.27 |
20 | RMSE | 185.29 | 43.75 | 97.88 | 29.65 | 26.48 | 53.90 | 37.34 | 24.26 |
50 | MAE | 114.93 | 24.31 | 20.98 | 5.88 | 4.41 | 11.35 | 6.62 | 4.28 |
50 | RMSE | 256.64 | 79.73 | 101.36 | 38.76 | 32.37 | 65.57 | 42.20 | 31.08 |
90 | MAE | 165.42 | 67.90 | 50.01 | 11.33 | 9.07 | 29.38 | 15.71 | 8.71 |
90 | RMSE | 375.36 | 191.95 | 156.47 | 58.56 | 50.57 | 117.72 | 82.01 | 47.58 |
Tab. 6 Experimental results of missing value imputation on Weather dataset
Missing rate/% | Metric | No TAM | No ISM | I2TDM |
---|---|---|---|---|
10 | MAE | 8.38 | 6.84 | 6.83 |
10 | RMSE | 16.17 | 12.23 | 12.12 |
50 | MAE | 12.85 | 8.45 | 8.40 |
50 | RMSE | 25.28 | 15.15 | 15.01 |
90 | MAE | 38.22 | 14.22 | 14.12 |
90 | RMSE | 61.43 | 24.80 | 24.74 |
Tab. 7 Ablation experiment results
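The ablation numbers are easier to compare as the percentage by which the error grows when a component is removed. The sketch below computes that for the MAE rows of Tab. 7. The values are copied from the table; the helper name `error_increase` is an illustrative assumption. The I2TDM column of Tab. 7 matches the I2TDM column of Tab. 4, which suggests the ablation was run on AQI.

```python
# MAE values copied from Tab. 7, keyed by missing rate (%).
ABLATION_MAE = {
    10: {"No TAM": 8.38, "No ISM": 6.84, "I2TDM": 6.83},
    50: {"No TAM": 12.85, "No ISM": 8.45, "I2TDM": 8.40},
    90: {"No TAM": 38.22, "No ISM": 14.22, "I2TDM": 14.12},
}


def error_increase(full, ablated):
    """Percentage increase in error of the ablated variant over the full model."""
    return 100.0 * (ablated - full) / full


increase = {
    (rate, variant): error_increase(row["I2TDM"], row[variant])
    for rate, row in ABLATION_MAE.items()
    for variant in ("No TAM", "No ISM")
}

for (rate, variant), value in increase.items():
    print(f"rate={rate}% {variant}: +{value:.2f}% MAE")
```

Under these numbers, removing TAM raises MAE by roughly 23% at a 10% missing rate and by roughly 171% at 90%, while removing ISM changes MAE by less than 1% at every rate shown.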