Journal of Computer Applications, 2013, Vol. 33, Issue (11): 3138-3140.

• Artificial intelligence •

Rapid speech keyword spotting method based on template matching

ZHU Guoteng1, SUN Wei2

  1. School of Information Science and Technology, Sun Yat-sen University, Guangzhou, Guangdong 510006, China
  2. School of Software, Sun Yat-sen University, Guangzhou, Guangdong 510006, China
  • Received: 2013-04-22 Revised: 2013-06-09 Published: 2013-11-01 Online: 2013-12-04
  • Contact: SUN Wei
  • About the authors: ZHU Guoteng (born 1988), male, from Chenzhou, Hunan; master's student; research interests: pattern recognition and speech processing. SUN Wei (born 1972), male, from Lianyungang, Jiangsu; professor and doctoral supervisor; research interests: multimedia security and digital media.

Abstract: When keywords must be spotted in speech for which no training samples are available, template matching-based keyword spotting still works where traditional methods do not. However, template matching is time-consuming, because it computes the local minimum distance by shifting the template frame by frame. The extreme points of the local minimum distance usually lie near phoneme segmentation points. Exploiting this positional relationship together with interpolation, a fast template matching method is proposed. The method obtains the local minimum distance between phoneme segmentation points by interpolation, which effectively shortens the computation time. In experiments on the TIMIT and CASIA corpora, the improved method is approximately 2.8 times faster than conventional template matching-based keyword spotting.

Key words: keyword spotting, Dynamic Time Warping (DTW), phoneme segmentation, interpolation
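
The idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the distance measure, the window length, the interpolation scheme, or how phoneme segmentation points are obtained. The sketch assumes a length-normalized DTW distance, a search window fixed to the template length, linear interpolation via `numpy.interp`, and segmentation points supplied by the caller. All function names are hypothetical.

```python
import numpy as np

def dtw_distance(a, b):
    """Length-normalized DTW distance between feature sequences a (n, d) and b (m, d)."""
    n, m = len(a), len(b)
    # Pairwise Euclidean distances between all frames of a and b.
    cost = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1]
            )
    return acc[n, m] / (n + m)

def distances_at(template, utterance, starts):
    """DTW distance of the template against a window beginning at each start frame."""
    win = len(template)  # assumption: window length equals template length
    return np.array([dtw_distance(template, utterance[s:s + win]) for s in starts])

def baseline_curve(template, utterance):
    """Conventional approach: evaluate the distance at every frame shift."""
    n_starts = len(utterance) - len(template) + 1
    return distances_at(template, utterance, np.arange(n_starts))

def interpolated_curve(template, utterance, boundaries):
    """Evaluate DTW only at segmentation points and interpolate in between.

    `boundaries` are frame indices of phoneme segmentation points; how they
    are detected is outside this sketch.
    """
    n_starts = len(utterance) - len(template) + 1
    valid = [b for b in boundaries if 0 <= b < n_starts]
    # Always include both ends so the interpolation covers the whole range.
    anchors = np.unique(np.array([0, n_starts - 1] + valid, dtype=int))
    anchor_dist = distances_at(template, utterance, anchors)
    # Assumption: linear interpolation between the anchor distances.
    return np.interp(np.arange(n_starts), anchors, anchor_dist)
```

In this sketch the saving comes from calling `dtw_distance` once per segmentation point instead of once per frame. The speed-up that results depends on how dense the segmentation points are, and the sketch makes no claim about reproducing the factor reported in the abstract.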
