Journal of Computer Applications, 2020, Vol. 40, Issue (2): 589-594. DOI: 10.11772/j.issn.1001-9081.2019071183

• Frontier, Interdisciplinary and Comprehensive Applications •

Speech deception detection algorithm based on denoising autoencoder and long short-term memory network

Hongliang FU, Peizhi LEI

  1. School of Information Science and Engineering, Henan University of Technology, Zhengzhou 450001, China
  • Received: 2019-07-08 Revised: 2019-09-01 Accepted: 2019-09-02 Online: 2019-09-19 Published: 2020-02-10
  • Contact: Peizhi LEI
  • About author: FU Hongliang, born in 1965, male, native of Zhengzhou, Henan, Ph.D., professor. His research interests include modern signal processing.
  • Supported by:
    the National Natural Science Foundation of China (61601170)

Abstract:

To further improve the performance of speech deception detection, a speech deception detection algorithm based on a Denoising AutoEncoder (DAE) and a Long Short-Term Memory (LSTM) network was proposed. Firstly, a parallel structure of an optimized DAE and an LSTM was constructed, namely PDL (Parallel connection of DAE and LSTM). Then, hand-crafted features were extracted from the speech and fed into the DAE to obtain more robust features. Simultaneously, the Mel spectra extracted after windowing and framing the speech were fed into the LSTM frame by frame to learn frame-level deep features. Finally, these two types of features were fused through the fully connected layer and batch normalization, and a softmax classifier was used for deception recognition. The experimental results on the CSC (Columbia-SRI-Colorado) corpus and the self-built corpus show that the recognition accuracy of classification with the fused features is 65.18% and 68.04% respectively, which is up to 5.56% and 7.22% higher than that of the other compared algorithms, indicating that the proposed algorithm can effectively improve the accuracy of deception recognition.

Key words: Denoising AutoEncoder (DAE), Long Short-Term Memory (LSTM) network, speech feature, feature fusion, deception detection
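The data flow described in the abstract (a DAE branch for utterance-level hand-crafted features, an LSTM branch for frame-level Mel spectra, then fusion and softmax) can be illustrated with a minimal forward-pass sketch in plain Python. This is not the authors' implementation: the abstract does not give layer sizes, activations, or the fusion operator, so every dimension and name below (`N_HAND`, `N_MEL`, `HIDDEN`, `pdl_forward`, and so on) is hypothetical, weights are random rather than trained, biases and the learned batch-normalization scale and shift are omitted, the last LSTM hidden state is assumed to summarize the utterance, and fusion is assumed to be concatenation.

```python
import math
import random

random.seed(0)

def rand_matrix(rows, cols, scale=0.1):
    """Small random weights; stand-ins for trained parameters."""
    return [[random.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

def linear(x, w):
    """Bias-free linear layer: w @ x for a single vector x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def dae_encode(x, w_enc, noise_std=0.0):
    """DAE encoder: corrupt the input with Gaussian noise (training only), then encode."""
    if noise_std > 0.0:
        x = [xi + random.gauss(0.0, noise_std) for xi in x]
    return [math.tanh(h) for h in linear(x, w_enc)]

def lstm_last_state(frames, w, hidden):
    """Run one LSTM layer over the Mel frames and return the final hidden state."""
    h = [0.0] * hidden
    c = [0.0] * hidden
    for frame in frames:
        z = linear(frame + h, w)  # 4*hidden pre-activations, ordered i, f, o, g
        i = [sigmoid(v) for v in z[:hidden]]
        f = [sigmoid(v) for v in z[hidden:2 * hidden]]
        o = [sigmoid(v) for v in z[2 * hidden:3 * hidden]]
        g = [math.tanh(v) for v in z[3 * hidden:]]
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h

def batch_norm(batch, eps=1e-5):
    """Normalize each feature to zero mean and unit variance across the batch."""
    cols = list(zip(*batch))
    means = [sum(col) / len(col) for col in cols]
    variances = [sum((v - m) ** 2 for v in col) / len(col)
                 for col, m in zip(cols, means)]
    return [[(v - m) / math.sqrt(s + eps) for v, m, s in zip(row, means, variances)]
            for row in batch]

def softmax(z):
    top = max(z)
    exps = [math.exp(v - top) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def pdl_forward(hand_feats, mel_batch, p):
    """Parallel DAE + LSTM branches, FC + batch norm per branch, fusion, softmax."""
    dae_out = [dae_encode(x, p["dae"]) for x in hand_feats]
    lstm_out = [lstm_last_state(m, p["lstm"], p["hidden"]) for m in mel_batch]
    branch_a = batch_norm([linear(v, p["fc_dae"]) for v in dae_out])
    branch_b = batch_norm([linear(v, p["fc_lstm"]) for v in lstm_out])
    fused = [a + b for a, b in zip(branch_a, branch_b)]  # assumed: concatenation
    return [softmax(linear(v, p["out"])) for v in fused]

# Hypothetical sizes: 8 hand-crafted features, 6 Mel bands, 10 frames, batch of 3.
N_HAND, N_MEL, HIDDEN, DAE_DIM, FC_DIM = 8, 6, 4, 5, 3
params = {
    "hidden": HIDDEN,
    "dae": rand_matrix(DAE_DIM, N_HAND),
    "lstm": rand_matrix(4 * HIDDEN, N_MEL + HIDDEN),
    "fc_dae": rand_matrix(FC_DIM, DAE_DIM),
    "fc_lstm": rand_matrix(FC_DIM, HIDDEN),
    "out": rand_matrix(2, 2 * FC_DIM),
}
hand_feats = [[random.random() for _ in range(N_HAND)] for _ in range(3)]
mel_batch = [[[random.random() for _ in range(N_MEL)] for _ in range(10)]
             for _ in range(3)]
probs = pdl_forward(hand_feats, mel_batch, params)  # one [p_truth, p_lie] pair per utterance
```

Each row of `probs` holds two class probabilities for one utterance. With untrained random weights the values carry no meaning; the sketch only shows how the two feature streams stay separate until the fusion step.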

CLC number: