Privacy preserving localization of surveillance images based on large vision models

Journal of Computer Applications, 2025, Vol. 45, Issue 3: 832-839. DOI: 10.11772/j.issn.1001-9081.2024101538

• Frontier research and typical applications of large models •

Qiang LI¹, Shaoxiong BAI¹, Yuan XIONG², Wei YUAN³

^1. China Energy Institute of Transportation Technology Research Company Limited, Beijing 100080, China
^2. School of Cyber Science and Technology, Shenzhen Campus of Sun Yat-sen University, Shenzhen, Guangdong 518107, China
^3. China Software Testing Center (Ministry of Industry and Information Technology Software and Integrated Circuit Promotion Center), Beijing 102206, China

Received: 2024-10-31 Revised: 2024-12-02 Accepted: 2024-12-04 Online: 2025-01-06 Published: 2025-03-10
Corresponding author: Wei YUAN
About the authors: LI Qiang, born in 1996 in Shenchi, Shanxi, M. S., engineer. His research interests include intelligent system construction, intelligent equipment innovation, and safe operation.
BAI Shaoxiong, born in 1997 in Dingzhou, Hebei, M. S. His research interests include computer vision.
XIONG Yuan, born in 1988 in Qichun, Hubei, Ph. D. His research interests include computer vision, computer graphics, and deep learning.

Abstract

Visual localization of surveillance images is a key technology in industrial intelligence. Existing visual localization algorithms do not protect the private information contained in images, so sensitive content can leak during data transmission. To address this problem, a surveillance image localization method based on Large Vision Models (LVMs) is proposed. First, an LVM-based privacy-preserving visual localization architecture is designed, which applies style transfer to the input image using a small number of text prompts and reference images. Second, a feature matching algorithm for style-transferred images is proposed to estimate the camera pose. Experimental results on public datasets show that the proposed method yields a small localization error and substantially reduces privacy leakage while maintaining localization accuracy.

Key words: diffusion model, surveillance localization, Large Vision Model (LVM), visual localization, privacy preserving

CLC number: TP391.4

Qiang LI, Shaoxiong BAI, Yuan XIONG, Wei YUAN. Privacy preserving localization of surveillance images based on large vision models[J]. Journal of Computer Applications, 2025, 45(3): 832-839.

Method	RMSE	PSNR/dB	SSIM	FID
SD-CN-Normal	98.938	8.377	0.224	286.16
SD-CN-Depth	107.040	7.609	0.193	316.13
SD-CN-LSD	102.090	8.019	0.128	316.05
SD-CN-Edge	77.623	10.446	0.238	339.77
SD-CN-Canny	86.994	9.465	0.216	214.94
SD-CN-Tile	39.421	16.239	0.218	174.55
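The RMSE and PSNR columns above are standard pixel-level measures of how far one image departs from another. The page does not give the evaluation protocol, so the following is only a minimal sketch under stated assumptions: images are flattened to sequences of 8-bit intensities, the peak value is 255, and the function names `rmse` and `psnr` are illustrative rather than taken from the paper.

```python
import math


def rmse(reference, test):
    """Root-mean-square error between two equally sized pixel sequences."""
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - b) ** 2 for a, b in zip(reference, test))
    return math.sqrt(squared / len(reference))


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB (assumes 8-bit images by default)."""
    error = rmse(reference, test)
    if error == 0:
        return math.inf  # identical images have no noise
    return 20.0 * math.log10(peak / error)


# Toy data: every pixel of the second "image" differs from the first by 10.
original = [0, 64, 128, 255]
stylized = [10, 54, 138, 245]
print(rmse(original, stylized), psnr(original, stylized))
```

Under this definition a larger RMSE always corresponds to a smaller PSNR for a single image pair, which matches the ordering of the rows in the table. Values averaged over a whole dataset need not satisfy the formula exactly.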

Image	Method	KPS	RPJE/px	POS/m	ROT/(°)	PSNR/dB
Images without style transfer	DenseVLAD	40.50	14.660	6.936	3.185	11.158
	MeshLoc	242.60	3.361	0.677	0.342	12.281
	VirtualLoc	284.10	3.120	0.248	0.181	12.476
	Proposed method	293.20	3.032	0.241	0.178	12.480
Style-transferred images	DenseVLAD	35.12	63.050	8.993	4.561	11.626
	MeshLoc	88.43	54.910	8.064	4.045	11.717
	VirtualLoc	142.10	46.270	7.949	3.982	11.729
	Proposed method	187.30	3.826	1.212	0.601	13.298
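The page does not define the POS/m and ROT/(°) columns. If they are read as the usual camera pose errors (an assumption), the position error is the Euclidean distance between estimated and ground-truth camera positions, and the rotation error is the angle of the relative rotation between the two orientations. A minimal sketch of that common convention follows; the helper names `position_error`, `rotation_error_deg`, and `rot_z` are hypothetical.

```python
import math


def position_error(t_est, t_gt):
    """Euclidean distance between estimated and ground-truth positions."""
    return math.dist(t_est, t_gt)


def rotation_error_deg(r_est, r_gt):
    """Angle in degrees of the relative rotation R_est^T R_gt.

    Both inputs are 3x3 rotation matrices given as nested lists.
    trace(R_est^T R_gt) equals the sum of element-wise products.
    """
    trace = sum(r_est[i][j] * r_gt[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def rot_z(degrees):
    """Rotation about the z-axis, used here only to build toy inputs."""
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


# Toy poses: positions 5 m apart, orientations 30 degrees apart.
print(position_error((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)))
print(rotation_error_deg(rot_z(10.0), rot_z(40.0)))
```

The rotation measure is symmetric in its two arguments and independent of the chosen world frame, which is why it is a common choice for reporting orientation accuracy.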
