Research on Lip Motion Recognition of Speaker Based on SIFT

Ma Xinjun (马新军), Wu Chenchen (吴晨晨), Zhong Qianyuan (仲乾元), Li Yuanyuan (李园园)

  1. Harbin Institute of Technology (Shenzhen)
  • Received: 2017-03-08 Revised: 2017-05-24 Online: 2017-05-24
  • Corresponding author: Wu Chenchen (吴晨晨)

Abstract: To address the high dimensionality of lip features and their sensitivity to scale, a speaker authentication method that uses the scale-invariant feature transform (SIFT) algorithm for feature extraction is proposed. First, a simple video frame normalization algorithm is proposed that adjusts lip motion videos of different lengths to the same length and extracts representative lip motion frames. Then, an algorithm that extracts texture and motion features at SIFT keypoints is presented; the features are combined using principal component analysis (PCA) to obtain representative lip motion features for authentication. Finally, a simple classification algorithm is proposed based on the resulting features. Experimental results show that, compared with the common local binary pattern (LBP) and histogram of oriented gradients (HOG) features, the proposed feature extraction algorithm achieves a better false rejection rate (FRR) and false acceptance rate (FAR). This indicates that the overall speaker lip motion recognition algorithm is effective and yields fairly satisfactory results.
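
The abstract does not give the details of the frame normalization step or of the evaluation procedure, so the following is only a minimal sketch under stated assumptions. It assumes that length normalization is done by picking frames at evenly spaced positions, and that authentication accepts a claim when a similarity score reaches a threshold. The function names `normalize_frame_count` and `far_frr` are hypothetical and do not come from the paper; the FAR and FRR definitions are the standard ones.

```python
def normalize_frame_count(frames, target_len):
    """Return target_len frames picked at evenly spaced positions.

    Assumption: uniform sampling stands in for the paper's (unspecified)
    normalization rule. Short clips repeat frames; long clips skip frames.
    """
    if target_len <= 0:
        raise ValueError("target_len must be positive")
    if not frames:
        raise ValueError("frames must not be empty")
    n = len(frames)
    if target_len == 1:
        return [frames[n // 2]]  # single frame: take the middle one
    step_den = target_len - 1
    # Integer round-half-up of i * (n - 1) / (target_len - 1).
    return [frames[(2 * i * (n - 1) + step_den) // (2 * step_den)]
            for i in range(target_len)]


def far_frr(genuine_scores, impostor_scores, threshold):
    """Compute (FAR, FRR) for one decision threshold.

    Assumption: a higher score means a better match, and a claim is
    accepted when score >= threshold.
    FAR = accepted impostor attempts / all impostor attempts.
    FRR = rejected genuine attempts / all genuine attempts.
    """
    if not genuine_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    return (false_accepts / len(impostor_scores),
            false_rejects / len(genuine_scores))


if __name__ == "__main__":
    # Frames are stood in for by their indices here.
    print(normalize_frame_count(list(range(10)), 4))
    print(far_frr([0.9, 0.8, 0.4, 0.7], [0.1, 0.6, 0.3, 0.2], 0.5))
```

Sweeping `threshold` over a range of values and plotting the two rates against each other is one common way to compare feature sets such as SIFT-based, LBP, and HOG features on the same data.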

Keywords: lip feature, scale-invariant feature transform (SIFT), feature extraction, recognition, classification

CLC number: