Journal of Computer Applications ›› 2012, Vol. 32 ›› Issue (07): 2070-2073. DOI: 10.3724/SP.J.1087.2012.02070

• Typical Applications •

Robust speech recognition by adopting random projection in feature space

ZHOU A-zhuan (周阿转), YU Yi-biao (俞一彪)

  1. Speech Technology Laboratory, Soochow University, Suzhou, Jiangsu 215006, China
  • Received: 2011-12-13 Revised: 2012-02-16 Online: 2012-07-05 Published: 2012-07-01
  • Contact: ZHOU A-zhuan
  • About the authors: ZHOU A-zhuan (born 1981), female, from Xi'an, Shaanxi, master's student; research interests: speech signal processing and speech recognition. YU Yi-biao (born 1962), male, from Wuxi, Jiangsu, professor, Ph.D.; research interests: speech signal processing, information hiding, and multimedia processing.

Abstract: Noise markedly degrades speech recognition performance. To address this, a robust speech recognition method based on Random Projection (RP) in feature space is proposed and applied to a speech recognition system for in-car driving environments. First, the original speech feature parameters are linearly projected into a new feature space using random matrices, so that the new features preserve the distances among the original features with maximum probability while following a distribution closer to Gaussian. A Hidden Markov Model (HMM) is then trained for every word. In the test stage, majority voting is applied to the initial pattern-matching results to reach the final recognition decision. Experiments on CENSREC-2, the in-car speech recognition database of the Information Processing Society of Japan, show that random projection of the features greatly improves speech recognition performance in the driving environment.

Key words: speech recognition, Random Projection (RP), majority voting, CENSREC-2
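
The two generic building blocks named in the abstract, a linear random projection of feature vectors and a majority vote over several recognition results, can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not state how the random matrices are drawn, what dimensions are used, or how votes are collected. The Gaussian entries, the 1/sqrt(out_dim) scaling, and the function names `make_projection_matrix`, `project`, and `majority_vote` are all assumptions made here for illustration. HMM training is omitted.

```python
from collections import Counter

import numpy as np


def make_projection_matrix(in_dim, out_dim, seed):
    """Draw a random matrix with Gaussian entries (an assumed choice).

    The 1/sqrt(out_dim) scaling keeps squared vector norms unchanged
    in expectation after projection.
    """
    rng = np.random.default_rng(seed)
    return rng.standard_normal((in_dim, out_dim)) / np.sqrt(out_dim)


def project(features, matrix):
    """Linearly map a (frames, in_dim) feature array into the new space."""
    return np.asarray(features, dtype=float) @ matrix


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]
```

One plausible way to combine the pieces, again an assumption rather than something the abstract confirms, is to draw several matrices with different seeds, train one recognizer per projected feature set, and pass the recognizers' word hypotheses for an utterance to `majority_vote`.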

CLC number: