Journal of Computer Applications


Video snapshot compressive imaging reconstruction method based on dense spatio-temporal deformable attention


  • Received:2025-06-24 Revised:2025-09-28 Online:2025-10-23 Published:2025-10-23


DU Xiuli1, GAO Xing2, ZHANG Xiaoyu1, PAN Chengsheng3,4, ZOU Qijie5

  1. College of Information Engineering, Dalian University; Liaoning Key Laboratory of Communication Networks and Information Processing
    2. Dalian University
    3. College of Information Engineering, Dalian University, Dalian, Liaoning 116622, China
    4. Key Laboratory of Communication and Information Processing of Liaoning Province Universities (Dalian University), Dalian, Liaoning 116622, China
    5. College of Information, Dalian University
  • Corresponding author: GAO Xing
  • Supported by:
    Project of the Education Department of Liaoning Province

Abstract: Deep learning-based reconstruction algorithms for Video Snapshot Compressive Imaging (VSCI) have achieved promising results on many tasks, but they still suffer from insufficient detail recovery and high computational overhead when reconstructing dynamic scenes. To address these issues, a VSCI reconstruction method based on dense spatio-temporal deformable attention, called DeT-SCI, was proposed. First, the compressed measurements and masks were used to obtain initial feature representations. Then, a deformable attention module was designed to enhance the model's ability to capture local spatial deformations and global temporal dependencies. Finally, the dense connectivity structure was improved by introducing a channel splitting factor that enables dynamic group-wise progressive feature extraction and fusion, thereby strengthening feature representation and reducing reconstruction time. Experimental results on several simulated grayscale and color video datasets show that, compared with the second-best comparison methods, DeT-SCI improves the Peak Signal-to-Noise Ratio (PSNR) by 0.45 dB and 0.79 dB, respectively, while reducing the reconstruction time to 0.43 s. These results indicate that DeT-SCI substantially improves both reconstruction quality and computational efficiency in complex motion scenarios.
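The step in which compressed measurements and masks yield initial features follows the standard VSCI sensing model: B video frames are modulated by binary masks and summed into a single 2-D snapshot. The sketch below illustrates that model and a commonly used mask-normalized initialization; it is a hypothetical illustration with made-up dimensions, not the paper's code.

```python
import numpy as np

# Standard VSCI forward model (illustrative, not the paper's implementation):
# B frames X_t are modulated by binary masks M_t and summed into one
# compressed measurement Y = sum_t M_t * X_t.
rng = np.random.default_rng(0)
B, H, W = 8, 64, 64                        # frames per snapshot, frame size (assumed)
frames = rng.random((B, H, W))             # ground-truth video block X
masks = rng.integers(0, 2, (B, H, W))      # binary modulation masks M

# Snapshot measurement: element-wise modulation, then temporal sum.
measurement = (masks * frames).sum(axis=0)           # shape (H, W)

# A common initialization for learned reconstructors: normalize the
# measurement by the per-pixel mask sum, then re-modulate per frame.
mask_sum = np.maximum(masks.sum(axis=0), 1)          # avoid division by zero
init_features = masks * (measurement / mask_sum)     # shape (B, H, W)
```

The initial features then serve as the input that the attention and dense-connection modules refine.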

Key words: Video Snapshot Compressive Imaging (VSCI), compressed sensing, deep learning, Deformable Convolutional Network, self-attention mechanism
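The channel splitting factor mentioned in the abstract divides the feature channels into groups that are extracted and fused progressively. The exact design is not given here, so the following is only a minimal sketch of one plausible group-wise progressive scheme (each group's output is carried into the next group before transformation); the function name and the placeholder transform are assumptions.

```python
import numpy as np

def progressive_group_fusion(x, g, transform):
    """Hypothetical group-wise progressive fusion.

    x: (C, H, W) feature map; g: channel splitting factor (C divisible by g);
    transform: per-group feature extractor (a conv block in practice,
    a simple placeholder function here).
    """
    groups = np.array_split(x, g, axis=0)  # split channels into g groups
    outputs, carry = [], 0.0
    for grp in groups:
        out = transform(grp + carry)       # fuse the previous group's output
        outputs.append(out)
        carry = out                        # pass result to the next group
    return np.concatenate(outputs, axis=0) # restore the (C, H, W) layout

# Usage with an identity transform: on an all-ones input with g=4,
# the carried sums make successive groups accumulate progressively.
fused = progressive_group_fusion(np.ones((8, 16, 16)), 4, lambda t: t)
```

Processing smaller channel groups in sequence, rather than one dense block over all C channels, is one way such a design can cut computation while still letting later groups see earlier features.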

