Design and implementation of VR head tracking system based on ARM platform

doi:10.11772/j.issn.1001-9081.2023081097

Journal of Computer Applications ›› 0, Vol. ›› Issue (): 192-200. DOI: 10.11772/j.issn.1001-9081.2023081097

• Multimedia computing and computer simulation •

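Xiangjun ZHANG, Guoshu HUANG, Tao QIU, Ming WANG, Xungang YIN, Bingbing FAN

Goertek Company Limited, Qingdao Shandong 266000, China

Received: 2023-08-14 Revised: 2024-03-21 Accepted: 2024-03-25 Online: 2025-01-24 Published: 2024-12-31
Corresponding author: Tao QIU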
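About the authors: Xiangjun ZHANG, born in 1981 in Anyang, Henan, male, M.S., senior engineer, CCF member. Research interests: virtual reality, system software.
Guoshu HUANG, born in 1993 in Taichung, Taiwan, male, M.S., engineer. Research interests: machine vision, SLAM algorithms.
Tao QIU, born in 1990 in Qingdao, Shandong, male, M.S., engineer. Research interests: virtual reality, system software.
Ming WANG, born in 1987 in Qingdao, Shandong, male, M.S., senior engineer. Research interests: virtual reality, system software.
Xungang YIN, born in 1990 in Jinan, Shandong, male, M.S., engineer. Research interests: 3D data visualization.
Bingbing FAN, born in 1998 in Taiyuan, Shanxi, male, M.S. candidate. Research interests: SLAM algorithms.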

Abstract:

With the rapid development of Virtual Reality (VR) all-in-one devices, inside-out Six Degrees of Freedom (6DOF) tracking has become the mainstream head tracking solution. However, these tracking algorithms must process large amounts of image and Inertial Measurement Unit (IMU) data simultaneously, which consumes substantial computational resources. When the algorithms are deployed on an ARM mobile platform, achieving good tracking accuracy and stability with limited computing power becomes the key technical difficulty. Therefore, based on an ARM mobile platform and the open-source ORB-SLAM3 algorithm, a VR 6DOF head tracking system with high tracking accuracy and good stability was designed and implemented. Firstly, the IMU and camera were jointly calibrated using a high-precision manipulator, and pose fusion was optimized using IMU integration information together with Visual-Inertial Odometry (VIO). Then, to suit the VR usage scenario, the number of feature points per image frame was reduced to lower the algorithm complexity, achieving a 6DOF pose output frequency of nearly 1 kHz. Finally, a fast handling strategy for visual tracking loss was proposed to improve the stability of the tracking state. Experimental results of the optimized algorithm on the Qualcomm XR2 platform show that, compared with the traditional ORB-SLAM3 algorithm, the proposed algorithm reduces the Relative Position Error (RPE) and Relative Rotation Error (RRE) by 44.4% and 73.4% respectively, and meets the real-time and accuracy requirements of VR all-in-one devices.

Key words: Virtual Reality (VR), Advanced RISC Machine (ARM) platform, Six Degrees of Freedom (6DOF), Oriented FAST and Rotated BRIEF-Simultaneous Localization And Mapping 3 (ORB-SLAM3), Inertial Measurement Unit (IMU)

CLC Number: TP391.4

Xiangjun ZHANG, Guoshu HUANG, Tao QIU, Ming WANG, Xungang YIN, Bingbing FAN. Design and implementation of VR head tracking system based on ARM platform[J]. Journal of Computer Applications, 0, (): 192-200.

Figures/Tables 35

Axis	Translation	Rotation	Moving speed/(mm·s^-1)
X	J2	J4	240
Y	J1	J5	240
Z	J3	J6	240

Parameter	Camera1	Camera2	Parameter	Camera1	Camera2
f_x	242.863 1	242.866 9	k₁	-0.023 4	-0.033 7
f_y	243.056 7	242.833 7	k₂	0.096 1	0.119 1
c_x	318.899 7	319.951 9	k₃	-0.066 9	-0.089 2
c_y	239.252 9	239.632 2	k₄	0.012 1	0.019 6
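
The four coefficients k₁ to k₄ alongside f_x, f_y, c_x and c_y match the parameter set of a Kannala-Brandt style fisheye model, although the table itself does not name the model. Under that assumption, the following minimal sketch projects a 3D point in the camera frame to pixel coordinates using the Camera1 values. The function name `project_kb4` is hypothetical and is not taken from the paper.

```python
import math

# Camera1 values copied from the table above (digit-group spaces removed).
FX, FY, CX, CY = 242.8631, 243.0567, 318.8997, 239.2529
K = (-0.0234, 0.0961, -0.0669, 0.0121)  # k1..k4


def project_kb4(x, y, z, fx=FX, fy=FY, cx=CX, cy=CY, k=K):
    """Project a camera-frame point with a 4-coefficient fisheye model.

    Assumption: the table's k1..k4 are Kannala-Brandt coefficients, i.e.
    theta_d = theta * (1 + k1*theta^2 + k2*theta^4 + k3*theta^6 + k4*theta^8).
    """
    r = math.hypot(x, y)
    if r < 1e-12:
        # A point on the optical axis maps to the principal point.
        return cx, cy
    theta = math.atan2(r, z)  # angle between the ray and the optical axis
    t2 = theta * theta
    theta_d = theta * (1 + k[0] * t2 + k[1] * t2**2 + k[2] * t2**3 + k[3] * t2**4)
    u = fx * theta_d * x / r + cx
    v = fy * theta_d * y / r + cy
    return u, v


if __name__ == "__main__":
    print(project_kb4(0.0, 0.0, 1.0))  # principal point
    print(project_kb4(1.0, 0.0, 1.0))  # a ray 45 degrees off-axis along +x
```

A point on the optical axis lands on (c_x, c_y), which gives a quick sanity check when swapping in the Camera2 column.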

Metric	Camera1/px	Camera2/px	Gyroscope/(rad·s^-1)	Accelerometer/(m·s^-2)
Mean	0.114 6	0.102 8	0.008 4	0.061 6
Standard deviation	0.073 4	0.063 6	0.007 7	0.136 6

Error type	Gyroscope	Accelerometer
Noise density	3.911 367 04 rad/s^0.5	0.000 570 16 m/s^1.5
Random walk	1.535 914 64 rad/s^1.5	2.607 395 48 m/s^2.5

Tracking state	State value	Tracking state	State value
SYSTEM_NOT_READY	-1	RECENTLY_LOST	3
NO_IMAGES_YET	0	LOST	4
NOT_INITIALIZED	1	OK_KLT	5
OK	2	CAMERA_COVERED	6
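
The state table maps naturally onto an integer enumeration. The sketch below encodes the names and values exactly as listed. The helper `is_tracking` is hypothetical, and its rule that only `OK` and `OK_KLT` count as valid tracking is an assumption for illustration, not something the table states.

```python
from enum import IntEnum


class TrackingState(IntEnum):
    """Tracking states and values as listed in the table above."""
    SYSTEM_NOT_READY = -1
    NO_IMAGES_YET = 0
    NOT_INITIALIZED = 1
    OK = 2
    RECENTLY_LOST = 3
    LOST = 4
    OK_KLT = 5
    CAMERA_COVERED = 6


def is_tracking(state: TrackingState) -> bool:
    # Assumption: only the two OK states carry a usable visual pose.
    return state in (TrackingState.OK, TrackingState.OK_KLT)


if __name__ == "__main__":
    for value in (2, 3, 6):
        state = TrackingState(value)
        print(state.name, is_tracking(state))
```

Using `IntEnum` keeps the values comparable with raw integers, which helps if the state crosses a native or IPC boundary as a plain number.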

Test metric	Test result
Total running time/s	1 300
Number of runs	10
Time per run/s	130
Average CPU usage/%	28
Maximum CPU usage/%	33

Parameter	Value	Parameter	Value
Resolution	2 048×1 536	Field of view/(°)	60×46
Frame rate/(frame·s^-1)	120	Maximum working distance/m	18

Metric (ORB-SLAM3 algorithm/optimized algorithm)	Max	Min	RMS	Mean
APE/m	0.687/0.263	0.027/0.011	0.251/0.124	0.203/0.100
ARE/(°)	30.48/4.170	0.056/1.002	5.989/1.933	4.845/1.534
RPE/m	0.096/0.077	0.001/0.000 1	0.011/0.008	0.009/0.005
RRE/(°)	4.871/1.677	0.023/0.002	0.893/0.264	0.753/0.200
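
The 44.4% and 73.4% reductions quoted in the abstract appear to correspond to the Mean column of this table. The short check below recomputes the relative reduction for each metric from the Mean values; the helper name `reduction_pct` is hypothetical.

```python
def reduction_pct(baseline: float, optimized: float) -> float:
    """Relative reduction of `optimized` with respect to `baseline`, in percent."""
    return (baseline - optimized) / baseline * 100.0


# Mean column of the table above: (ORB-SLAM3 algorithm, optimized algorithm).
MEAN = {
    "APE/m": (0.203, 0.100),
    "ARE/deg": (4.845, 1.534),
    "RPE/m": (0.009, 0.005),
    "RRE/deg": (0.753, 0.200),
}

if __name__ == "__main__":
    for name, (baseline, optimized) in MEAN.items():
        print(f"{name}: {reduction_pct(baseline, optimized):.1f}% lower")
```

For RPE this gives (0.009 - 0.005) / 0.009 ≈ 44.4%, and for RRE (0.753 - 0.200) / 0.753 ≈ 73.4%, matching the abstract.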
