Journal of Computer Applications


Video snapshot compressive imaging reconstruction method based on dense spatio-temporal deformable attention


  • Received:2025-06-24 Revised:2025-09-28 Online:2025-10-23 Published:2025-10-23


DU Xiuli1, GAO Xing2, ZHANG Xiaoyu1, PAN Chengsheng3,4, ZOU Qijie5

  1. College of Information Engineering, Dalian University; Liaoning Key Laboratory of Communication Networks and Information Processing
    2. Dalian University
    3. College of Information Engineering, Dalian University, Dalian, Liaoning 116622, China
    4. Key Laboratory of Communication and Information Processing of Liaoning Province Universities (Dalian University), Dalian, Liaoning 116622, China
    5. College of Information, Dalian University
  • Corresponding author: GAO Xing
  • Supported by:
    Project of the Education Department of Liaoning Province

Abstract: Deep learning-based reconstruction algorithms for Video Snapshot Compressive Imaging (VSCI) have achieved promising results on many tasks, but they still suffer from insufficient detail recovery and high computational overhead when reconstructing dynamic scenes. To address these issues, a VSCI reconstruction method based on dense spatio-temporal deformable attention, called DeT-SCI, was proposed. First, the compressed measurements and masks were used to obtain initial feature representations. Then, a deformable attention module was designed to enhance the model's ability to capture local spatial deformations and global temporal dependencies. Finally, the dense connectivity structure was improved by introducing a channel splitting factor that enables dynamic group-wise progressive feature extraction and fusion, thereby strengthening feature representation and reducing reconstruction time. Experimental results on several simulated grayscale and color video datasets show that, compared with the second-best comparison methods, DeT-SCI improves the Peak Signal-to-Noise Ratio (PSNR) by 0.45 dB and 0.79 dB, respectively, while reducing the reconstruction time to 0.43 s. These results indicate that DeT-SCI substantially improves both reconstruction quality and computational efficiency in complex motion scenarios.
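The step in which compressed measurements and masks yield initial features follows the standard VSCI sensing model: B video frames are modulated by binary masks and summed into a single 2-D snapshot. The sketch below illustrates that model and a commonly used mask-normalized initialization; it is a hypothetical illustration with made-up dimensions, not the paper's code.

```python
import numpy as np

# Standard VSCI forward model (illustrative, not the paper's implementation):
# B frames X_t are modulated by binary masks M_t and summed into one
# compressed measurement Y = sum_t M_t * X_t.
rng = np.random.default_rng(0)
B, H, W = 8, 64, 64                        # frames per snapshot, frame size (assumed)
frames = rng.random((B, H, W))             # ground-truth video block X
masks = rng.integers(0, 2, (B, H, W))      # binary modulation masks M

# Snapshot measurement: element-wise modulation, then temporal sum.
measurement = (masks * frames).sum(axis=0)           # shape (H, W)

# A common initialization for learned reconstructors: normalize the
# measurement by the per-pixel mask sum, then re-modulate per frame.
mask_sum = np.maximum(masks.sum(axis=0), 1)          # avoid division by zero
init_features = masks * (measurement / mask_sum)     # shape (B, H, W)
```

The initial features then serve as the input that the attention and dense-connection modules refine.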

Key words: Video Snapshot Compressive Imaging (VSCI), compressed sensing, deep learning, Deformable Convolutional Network, self-attention mechanism
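The channel splitting factor mentioned in the abstract divides the feature channels into groups that are extracted and fused progressively. The exact design is not given here, so the following is only a minimal sketch of one plausible group-wise progressive scheme (each group's output is carried into the next group before transformation); the function name and the placeholder transform are assumptions.

```python
import numpy as np

def progressive_group_fusion(x, g, transform):
    """Hypothetical group-wise progressive fusion.

    x: (C, H, W) feature map; g: channel splitting factor (C divisible by g);
    transform: per-group feature extractor (a conv block in practice,
    a simple placeholder function here).
    """
    groups = np.array_split(x, g, axis=0)  # split channels into g groups
    outputs, carry = [], 0.0
    for grp in groups:
        out = transform(grp + carry)       # fuse the previous group's output
        outputs.append(out)
        carry = out                        # pass result to the next group
    return np.concatenate(outputs, axis=0) # restore the (C, H, W) layout

# Usage with an identity transform: on an all-ones input with g=4,
# the carried sums make successive groups accumulate progressively.
fused = progressive_group_fusion(np.ones((8, 16, 16)), 4, lambda t: t)
```

Processing smaller channel groups in sequence, rather than one dense block over all C channels, is one way such a design can cut computation while still letting later groups see earlier features.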

