Journal of Computer Applications ›› 2021, Vol. 41 ›› Issue (3): 839-844.DOI: 10.11772/j.issn.1001-9081.2020060993

Special Issue: Multimedia Computing and Computer Simulation


3D virtual human animation generation based on dual-camera capture of facial expression and human pose

LIU Jie, LI Yi, ZHU Jiangping   

  1. College of Computer Science, Sichuan University, Chengdu Sichuan 610065, China
  • Received:2020-07-09 Revised:2020-10-27 Online:2021-03-10 Published:2021-01-15
  • Supported by:
    This work is partially supported by the Key Research and Development Program of Department of Science and Technology of Sichuan Province (2020YFG0306).


  • Corresponding author: LI Yi
  • About the authors: LIU Jie, born in 1996 in Aksu, Xinjiang, is an M.S. candidate; her research interests include image processing and computer vision. LI Yi, born in 1967 in Chengdu, Sichuan, is an associate professor and Ph.D.; his research interests include computer vision, image processing, and air traffic control automation systems. ZHU Jiangping, born in 1984 in Dazhou, Sichuan, is an associate professor and Ph.D.; his research interests include optical 3D sensing and computer vision.

Abstract: To generate three-dimensional virtual human animation with rich facial expression and smooth movement, a method based on synchronous capture of facial expression and human pose with two cameras was proposed. Firstly, Transmission Control Protocol (TCP) network timestamps were used to achieve time synchronization of the two cameras, and Zhang's calibration method was used to achieve their spatial synchronization. Then, the two cameras were used to capture facial expressions and human poses respectively. When capturing facial expressions, 2D feature points were extracted from the image, and regression on these points was used to compute the Facial Action Coding System (FACS) facial action units in preparation for expression animation. Taking the standard 3D head coordinates as reference and using the camera intrinsic parameters, the Efficient Perspective-n-Point (EPnP) algorithm was used to estimate the head pose, and the facial expression information was then matched with the head pose estimate. When capturing human poses, the Occlusion-Robust Pose-Map (ORPM) method was used to compute the human pose and output data such as the position and rotation angle of each skeletal joint. Finally, the established 3D virtual human model was used to demonstrate data-driven animation in Unreal Engine 4 (UE4). Experimental results show that the proposed method captures facial expressions and human poses simultaneously, reaching a frame rate of 20 fps in tests, and can therefore generate natural and realistic three-dimensional animation in real time.

Key words: dual-camera, human pose, facial expression, virtual human animation, synchronous capture

