
DPCS2017+15+A CNN-Based Detection Algorithm for Recaptured Speech

Can LI (李璨)1, Rangding WANG (王让定)2, Diqun YAN (严迪群)3

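  1. College of Information Science and Engineering, Ningbo University
  2. College of Information Science and Engineering, Ningbo University
  3. Ningbo University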
  • Received: 2017-08-01 Revised: 2017-08-20 Published: 2017-08-20
  • Corresponding author: Can LI (李璨)

DPCS2017+15+Recaptured voice replay detection based on convolutional neural network

  • Received: 2017-08-01 Revised: 2017-08-20 Online: 2017-08-20
  • Contact: Can LI

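Abstract: With the spread of high-fidelity, portable recording devices, recaptured speech closely resembles the original speech and is often used by attackers against speaker verification systems to obtain illegitimate authentication, harming the rights of legitimate users. To address this security problem, recaptured speech must be prevented from passing verification. This paper proposes a convolutional neural network (CNN) based method for detecting recaptured speech: spectrograms are extracted from original and recaptured speech and fed into the CNN. A network architecture suited to recaptured speech detection is built, the effect of spectrograms computed with different window shifts on the detection rate is analyzed, cross-device experiments are run on recaptured speech from different eavesdropping and playback devices, and the method is compared with existing classical algorithms. Experimental results show that the method accurately determines whether a test utterance is recaptured, reaching a detection rate of 99.26% and outperforming existing algorithms.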

Keywords: convolutional neural network, recaptured speech detection, spectrogram, recording devices, network architecture

Abstract: With the popularity of portable, high-fidelity recording devices, a recaptured voice closely resembles the original voice and may be used by criminals to attack speaker verification systems, which harms the legitimate interests of system users. To improve resistance to this attack, this paper proposes a CNN-based algorithm that detects recaptured speech from the spectrogram of the voice. A novel network structure is constructed for the detection task, and the effect of spectrograms with different window shifts is discussed. In addition, cross-over experiments across various eavesdropping and replay devices are conducted. Experimental results demonstrate that the proposed method can accurately discriminate whether a recorded voice is recaptured. The detection rate reaches 99.26%, which is higher than that of the state-of-the-art methods.

Key words: convolutional neural network (CNN), recaptured voice detection, spectrogram, recording devices, network structure
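
The abstract describes feeding spectrograms computed with different window shifts into a CNN. The following is only a minimal sketch of that preprocessing step, not the authors' implementation: the abstract does not state the frame length, window type, or shift values, so the Hamming window, the 512-sample frame, the three hop sizes, and the function name `log_spectrogram` are all assumptions chosen for illustration, and a synthetic tone stands in for real speech.

```python
import numpy as np


def log_spectrogram(signal, frame_len=512, hop=256):
    """Return a log-magnitude spectrogram of shape (n_frames, frame_len // 2 + 1).

    frame_len and hop (the window shift) are illustrative defaults,
    not values taken from the paper.
    """
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hamming(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice overlapping frames; consecutive frames start `hop` samples apart.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] for i in range(n_frames)]
    )
    # Real-input FFT of each windowed frame gives the one-sided spectrum.
    spectrum = np.fft.rfft(frames * window, axis=1)
    # Small offset avoids log(0) on silent bins.
    return 20.0 * np.log10(np.abs(spectrum) + 1e-10)


# Synthetic 1-second, 16 kHz tone stands in for a speech recording (assumption).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)

# A smaller window shift yields more frames, i.e. a wider spectrogram image.
for hop in (128, 256, 512):
    spec = log_spectrogram(tone, frame_len=512, hop=hop)
    print(hop, spec.shape)
```

Halving the window shift roughly doubles the number of frames while leaving the number of frequency bins unchanged, so the choice of shift directly sets the time resolution of the image a CNN would receive.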
