Panoramic video super-resolution network combining spherical alignment and adaptive geometric correction

doi:10.11772/j.issn.1001-9081.2025030311

Journal of Computer Applications ›› 2026, Vol. 46 ›› Issue (2): 528-535.DOI: 10.11772/j.issn.1001-9081.2025030311

• Multimedia computing and computer simulation • Previous Articles

Panoramic video super-resolution network combining spherical alignment and adaptive geometric correction

Xiaolei CHEN(), Zhiwei ZHENG, Xue HUANG, Zhenbin QU

College of Electrical and Information Engineering，Lanzhou University of Technology，Lanzhou Gansu 730050，China

Received:2025-03-25 Revised:2025-04-25 Accepted:2025-04-27 Online:2025-05-07 Published:2026-02-10
Contact: Xiaolei CHEN
About author:CHEN Xiaolei， born in 1979， Ph. D.， professor. His research interests include artificial intelligence， computer vision. Email:chenxl703@lut.edu.cn
ZHENG Zhiwei， born in 1998， M. S. candidate. Her research interests include computer vision.
HUANG Xue， born in 2000， M. S. candidate. Her research interests include computer vision.
QU Zhenbin， born in 2001， M. S. candidate. His research interests include computer vision.
Supported by:
National Natural Science Foundation of China(61967012)

结合球面对齐与自适应几何校正的全景视频超分辨率网络

陈晓雷(), 郑芷薇, 黄雪, 曲振彬

兰州理工大学电气工程与信息工程学院，兰州 730050

通讯作者: 陈晓雷
作者简介:陈晓雷（1979—），男，河南灵宝人，教授，博士，CCF会员，主要研究方向：人工智能、计算机视觉 Email:chenxl703@lut.edu.cn
郑芷薇（1998—），女，陕西汉中人，硕士研究生，主要研究方向：计算机视觉
黄雪（2000—），女，江苏徐州人，硕士研究生，主要研究方向：计算机视觉
曲振彬（2001—），男，山西忻州人，硕士研究生，主要研究方向：计算机视觉。
基金资助:
国家自然科学基金资助项目(61967012)

Abstract

Abstract:

Traditional Video Super-Resolution （VSR） methods are ineffective in solving geometric distortion problems caused by equirectangular projection when processing panoramic videos， and have deficiencies in inter-frame alignment and feature fusion， which results in poor reconstruction quality. To further improve the super-resolution reconstruction quality of panoramic videos， a panoramic video super-resolution network combining spherical alignment and adaptive geometric correction， named 360GeoVSR， was proposed. In the network， accurate alignment and efficient fusion of inter-frame features were achieved through a Spherical Alignment Module （SAM） and a Geometric Fusion Block （GFB）. In SAM， spatial transformation and deformable convolution were combined to address global and local geometric distortions. In GFB， feature alignment was corrected dynamically using an embedded Adaptive Geometric Correction （AGC） submodule， and multi-frame information was fused to capture complex inter-frame relationships. The results of subjective and objective comparison experiments on the extended ODV360Extended panoramic video dataset show that 360GeoVSR outperforms five representative super-resolution methods， including BasicVSR++ and VRT （Video Restoration Transformer）， in both objective metrics and subjective visual effects， verifying its effectiveness.

Key words: panoramic video, super-resolution, geometric correction, spherical alignment, deep learning

摘要：

现有的传统视频超分辨率（VSR）方法在处理全景视频时难以有效解决等距矩形投影带来的几何畸变问题，且在帧间对齐和特征融合方面存在不足，这些会导致重建效果不佳。为了进一步提升全景视频的超分辨率重建质量，提出一种结合球面对齐与自适应几何校正的全景视频超分辨率网络——360GeoVSR。该网络通过球面对齐模块（SAM）和几何融合块（GFB）实现帧间特征的精确对齐与高效融合。SAM结合空间变换和可变形卷积处理全局与局部几何畸变；GFB通过嵌入的自适应几何校正（AGC）子模块动态校正特征对齐，并融合多帧信息以捕捉复杂的帧间关系。在扩展的ODV360Extended全景视频数据集上的主客观对比实验结果表明，360GeoVSR在客观指标和主观视觉效果上均优于BasicVSR++和VRT（Video Restoration Transformer）等5种代表性超分辨率方法，验证了它的有效性。

关键词: 全景视频, 超分辨率, 几何校正, 球面对齐, 深度学习

CLC Number:

TP391.4

Xiaolei CHEN, Zhiwei ZHENG, Xue HUANG, Zhenbin QU. Panoramic video super-resolution network combining spherical alignment and adaptive geometric correction[J]. Journal of Computer Applications, 2026, 46(2): 528-535.

陈晓雷, 郑芷薇, 黄雪, 曲振彬. 结合球面对齐与自适应几何校正的全景视频超分辨率网络[J]. 《计算机应用》唯一官方网站, 2026, 46(2): 528-535.

Figures/Tables 8

References 34

[1]	卞鹏程，郑忠龙，李明禄，等. 基于注意力融合网络的视频超分辨率重建［J］. 计算机应用， 2021， 41（4）： 1012-1019.
	BIAN P C， ZHENG Z L， LI M L， et al. Attention fusion network based video super-resolution reconstruction［J］. Journal of Computer Applications， 2021， 41（4）： 1012-1019.
[2]	刘扬，刘蓉，方可，等. 基于帧间跨越光流的视频超分辨率重建网络［J］. 计算机应用， 2024， 44（4）： 1277-1284.
	LIU Y， LIU R， FANG K， et al. Video super-resolution reconstruction network based on frame straddling optical flow［J］. Journal of Computer Applications， 2024， 44（4）： 1277-1284.
[3]	PI H， TIAN S， LU M， et al. A comprehensive comparison of projections in omnidirectional super-resolution［C］// Proceedings of the 2023 IEEE International Conference on Acoustics， Speech and Signal Processing. Piscataway： IEEE， 2023： 1-5.
[4]	DOSOVITSKIY A， FISCHER P， ILG E， et al. FlowNet： learning optical flow with convolutional networks［C］// Proceedings of the 2015 IEEE International Conference on Computer Vision. Piscataway： IEEE， 2015： 2758-2766.
[5]	DAI J， QI H， XIONG Y， et al. Deformable convolutional networks［C］// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway： IEEE， 2017： 764-773.
[6]	CHAN K C K， WANG X， YU K， et al. Understanding deformable alignment in video super-resolution［C］// Proceedings of the 35th AAAI Conference on Artificial Intelligence. Palo Alto： AAAI Press， 2021： 973-981.
[7]	LUCAS B D， KANADE T. An iterative image registration technique with an application to stereo vision［C］// Proceedings of the 7th International Joint Conference on Artificial Intelligence. San Francisco： Morgan Kaufmann Publishers Inc.， 1981： 674-679.
[8]	CHAN K C K， WANG X， YU K， et al. BasicVSR： the search for essential components in video super-resolution and beyond［C］// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2021： 4945-4954.
[9]	LIU C， YANG H， FU J， et al. Learning trajectory-aware transformer for video super-resolution［C］// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2022： 5677-5686.
[10]	CHEN Y H， CHEN S C， CHEN Y H， et al. MoTIF： learning motion trajectories with local implicit neural functions for continuous space-time video super-resolution［C］// Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision. Piscataway： IEEE， 2023： 23074-23084.
[11]	SHI S， GU J， XIE L， et al. Rethinking alignment in video super-resolution Transformers［C］// Proceedings of the 36th International Conference on Neural Information Processing Systems. Red Hook： Curran Associates Inc.， 2022： 36081-36093.
[12]	WANG X， CHAN K C K， YU K， et al. EDVR： video restoration with enhanced deformable convolutional networks［C］// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Piscataway： IEEE， 2019： 1954-1963.
[13]	CHAN K C K， ZHOU S， XU X， et al. BasicVSR++： improving video super-resolution with enhanced propagation and alignment［C］// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2022： 5962-5971.
[14]	JO Y， OH S W， KANG J， et al. Deep video super-resolution network using dynamic upsampling filters without explicit motion compensation［C］// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2018： 3224-3232.
[15]	YI P， WANG Z， JIANG K， et al. Progressive fusion video super-resolution network via exploiting non-local spatio-temporal correlations［C］// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway： IEEE， 2019： 3106-3115.
[16]	LI W， TAO X， GUO T， et al. MuCAN： multi-correspondence aggregation network for video super-resolution［C］// Proceedings of the 2020 European Conference on Computer Vision， LNCS 12355. Cham： Springer， 2020： 335-351.
[17]	ISOBE T， JIA X， TAO X， et al. Look back and forth： video super-resolution with explicit temporal difference modeling［C］// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2022： 17390-17399.
[18]	LI F， ZHANG L， LIU Z， et al. Multi-frequency representation enhancement with privilege information for video super-resolution［C］// Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision. Piscataway： IEEE， 2023： 12768-12779.
[19]	GENG Z， LIANG L， DING T， et al. RSTT： real-time spatial temporal transformer for space-time video super-resolution［C］// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2022： 17420-17430.
[20]	ZHOU X， ZHANG L， ZHAO X， et al. Video super-resolution Transformer with masked inter&intra-frame attention［C］// Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2024： 25399-25408.
[21]	YI P， WANG Z， JIANG K， et al. Omniscient video super-resolution［C］// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway： IEEE， 2021： 4409-4418.
[22]	LIANG J， CAO J， FAN Y， et al. VRT： a video restoration Transformer［J］. IEEE Transactions on Image Processing， 2024， 33： 2171-2182.
[23]	LIU H， MA W， RUAN Z， et al. A single frame and multi-frame joint network for 360-degree panorama video super-resolution［J］. Engineering Applications of Artificial Intelligence， 2024， 134： No.108601.
[24]	LI N， LIU Y. Applying VertexShuffle toward 360-degree video super-resolution［C］// Proceedings of the 32nd Workshop on Network and Operating Systems Support for Digital Audio and Video. New York： ACM， 2022： 71-77.
[25]	LI N， LIU Y. Vertexshuffle-based spherical super-resolution for 360-degree videos［J］. ACM Transactions on Multimedia Computing， Communications and Applications， 2024， 20（9）： No.266.
[26]	AGRAHARI BANIYA A， LEE T K， EKLUND P W， et al. Omnidirectional video super-resolution using deep learning［J］. IEEE Transactions on Multimedia， 2024， 26： 540-554.
[27]	SHANG F， LIU H， MA W， et al. Lightweight super-resolution with self-calibrated convolution for panoramic videos［J］. Sensors， 2023， 23（1）： No.392.
[28]	LUO Z， CHAI B， WANG Z， et al. Masked360： enabling robust 360-degree video streaming with ultra low bandwidth consumption［J］. IEEE Transactions on Visualization and Computer Graphics， 2023， 29（5）： 2690-2699.
[29]	陈晓雷，曹宝宁，卢禹冰，等. 联合视口预测和超分辨率重建的实时VR全景视频流式传输方法［J］. 计算机辅助设计与图形学学报， 2024， 36（9）： 1394-1406.
	CHEN X L， CAO B N， LU Y B， et al. Live VR panoramic video streaming method combining viewport prediction and super-resolution［J］. Journal of Computer-Aided Design and Computer Graphics， 2024， 36（9）： 1394-1406.
[30]	CAO M， MOU C， YU F， et al. NTIRE 2023 challenge on 360° omnidirectional image and video super-resolution： datasets， methods and results［C］// Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Piscataway： IEEE， 2023： 1731-1745.
[31]	SUN Y， LU A， YU L. Weighted-to-spherically-uniform quality evaluation for omnidirectional video［J］. IEEE Signal Processing Letters， 2017， 24（9）： 1408-1412.
[32]	ZHOU Y， YU M， MA H， et al. Weighted-to-spherically-uniform SSIM objective quality evaluation for panoramic video［C］// Proceedings of the 14th IEEE International Conference on Signal Processing. Piscataway： IEEE， 2018： 54-57.
[33]	BARRON J T. A general and adaptive robust loss function［C］// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2019： 4326-4334.
[34]	KINGMA D P， BA J L. Adam： a method for stochastic optimization［EB/OL］. ［2025-03-03］..

SAM	GFB	AGC	PSNR/dB	SSIM	WS-PSNR/dB	WS-SSIM
×	×	×	34.45	0.935 9	33.44	0.932 0
×	√	×	36.06	0.948 4	35.05	0.945 4
√	√	×	36.14	0.948 6	35.10	0.945 5
×	√	√	36.27	0.949 7	35.25	0.946 8
√	√	√	36.42	0.950 6	35.37	0.947 5

SAM	GFB	AGC	PSNR/dB	SSIM	WS-PSNR/dB	WS-SSIM
×	×	×	34.45	0.935 9	33.44	0.932 0
×	√	×	36.06	0.948 4	35.05	0.945 4
√	√	×	36.14	0.948 6	35.10	0.945 5
×	√	√	36.27	0.949 7	35.25	0.946 8
√	√	√	36.42	0.950 6	35.37	0.947 5

GFB数（N_A）	GFB数（N_F）	PSNR/dB	SSIM	WS-PSNR/dB	WS-SSIM
4	4	37.05	0.956 0	36.10	0.953 0
4	6	37.10	0.956 5	36.15	0.953 5
4	8	37.12	0.956 8	36.18	0.953 8
6	4	37.15	0.957 0	36.20	0.954 3
6	6	37.20	0.957 5	36.25	0.955 0
6	8	37.22	0.957 8	36.27	0.955 2
8	4	37.25	0.958 0	36.28	0.955 5
8	6	37.30	0.958 5	36.29	0.956 1
8	8	37.28	0.958 3	36.28	0.955 8

GFB数（N_A）	GFB数（N_F）	PSNR/dB	SSIM	WS-PSNR/dB	WS-SSIM
4	4	37.05	0.956 0	36.10	0.953 0
4	6	37.10	0.956 5	36.15	0.953 5
4	8	37.12	0.956 8	36.18	0.953 8
6	4	37.15	0.957 0	36.20	0.954 3
6	6	37.20	0.957 5	36.25	0.955 0
6	8	37.22	0.957 8	36.27	0.955 2
8	4	37.25	0.958 0	36.28	0.955 5
8	6	37.30	0.958 5	36.29	0.956 1
8	8	37.28	0.958 3	36.28	0.955 8

方法	处理帧数	参数量/10⁶	PSNR/dB	SSIM	WS-PSNR/dB	WS-SSIM
Bicubic	—	—	32.64	0.921 1	31.79	0.914 2
EDVR^［12］	5	20.6	33.60	0.949 6	32.69	0.939 7
BasicVSR^［8］	30	6.3	35.73	0.946 3	34.61	0.942 3
GOVSR^［21］	30	7.0	35.78	0.946 9	34.61	0.943 1
BasicVSR++^［13］	30	7.3	37.26	0.958 2	36.28	0.955 9
VRT^［22］	6	35.6	37.10	0.957 6	36.10	0.954 1
360GeoVSR	30	19.9	37.30	0.958 5	36.29	0.956 1

Panoramic video super-resolution network combining spherical alignment and adaptive geometric correction

结合球面对齐与自适应几何校正的全景视频超分辨率网络

RichHTML

PDF

Knowledge

Abstract

Cite this article

share this article

Figures/Tables 8

References 34

Related Articles 15

Recommended Articles

Metrics

[1]	Jinjiao LIN, Canshun ZHANG, Shuya CHEN, Tianxin WANG, Jian LIAN, Yonghui XU. Vehicle insurance fraud detection method based on improved graph attention network [J]. Journal of Computer Applications, 2026, 46(2): 437-444.
[2]	Haoqian JIANG, Dong ZHANG, Guanyu LI, Heng CHEN. SetaCRS： Conversational recommender system with structure-enhanced hierarchical task-oriented prompting strategy [J]. Journal of Computer Applications, 2026, 46(2): 368-377.
[3]	Ming LI, Mengqi WANG, Aili ZHANG, Hua REN, Yuqiang DOU. Image steganography method based on conditional generative adversarial networks and hybrid attention mechanism [J]. Journal of Computer Applications, 2026, 46(2): 475-484.
[4]	Zeyi GUO, Fenglian LI, Lichun XU. Double decision mechanism-based deep symbolic regression algorithm [J]. Journal of Computer Applications, 2026, 46(2): 406-415.
[5]	Xiaoyong BIAN, Peiyang YUAN, Qiren HU. Dual-coding space-frequency mixing method for infrared small target detection [J]. Journal of Computer Applications, 2026, 46(1): 252-259.
[6]	Zhihui ZAN, Yajing WANG, Ke LI, Zhixiang YANG, Guangyu YANG. Multi-feature fusion speech emotion recognition method based on SAA-CNN-BiLSTM network [J]. Journal of Computer Applications, 2026, 46(1): 69-76.
[7]	Hongjun ZHANG, Gaojun PAN, Hao YE, Yubin LU, Yiheng MIAO. Multi-source heterogeneous data analysis method combining deep learning and tensor decomposition [J]. Journal of Computer Applications, 2025, 45(9): 2838-2847.
[8]	Jin LI, Liqun LIU. SAR and visible image fusion based on residual Swin Transformer [J]. Journal of Computer Applications, 2025, 45(9): 2949-2956.
[9]	Bing YIN, Zhenhua LING, Yin LIN, Changfeng XI, Ying LIU. Emotion recognition method compatible with missing modal reasoning [J]. Journal of Computer Applications, 2025, 45(9): 2764-2772.
[10]	Panfeng JING, Yudong LIANG, Chaowei LI, Junru GUO, Jinyu GUO. Semi-supervised image dehazing algorithm based on teacher-student learning [J]. Journal of Computer Applications, 2025, 45(9): 2975-2983.
[11]	Weigang LI, Jiale SHAO, Zhiqiang TIAN. Point cloud classification and segmentation network based on dual attention mechanism and multi-scale fusion [J]. Journal of Computer Applications, 2025, 45(9): 3003-3010.
[12]	Zhixiong XU, Bo LI, Xiaoyong BIAN, Qiren HU. Adversarial sample embedded attention U-Net for 3D medical image segmentation [J]. Journal of Computer Applications, 2025, 45(9): 3011-3016.
[13]	Lina GE, Mingyu WANG, Lei TIAN. Review of research on efficiency of federated learning [J]. Journal of Computer Applications, 2025, 45(8): 2387-2398.
[14]	Peng PENG, Ziting CAI, Wenling LIU, Caihua CHEN, Wei ZENG, Baolai HUANG. Speech emotion recognition method based on hybrid Siamese network with CNN and bidirectional GRU [J]. Journal of Computer Applications, 2025, 45(8): 2515-2521.
[15]	Shuo ZHANG, Guokai SUN, Yuan ZHUANG, Xiaoyu FENG, Jingzhi WANG. Dynamic detection method of eclipse attacks for blockchain node analysis [J]. Journal of Computer Applications, 2025, 45(8): 2428-2436.