Journal of Computer Applications ›› 2019, Vol. 39 ›› Issue (1): 126-130. DOI: 10.11772/j.issn.1001-9081.2018071596


Forensics algorithm of various operations for digital speech

XIANG Li, YAN Diqun, WANG Rangding, LI Xiaowen   

  1. Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo Zhejiang 315211, China
  • Received: 2018-07-19 Revised: 2018-08-07 Online: 2019-01-10 Published: 2019-01-21
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (U1736215, 61672302), the Natural Science Foundation of Zhejiang Province (LZ15F020002, LY17F020010), the Natural Science Foundation of Ningbo (2017A610123), and the Ningbo University Fund (XKXL1509, XKXL1503).

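Digital speech forensics algorithm for multiple processing traces

XIANG Li, YAN Diqun, WANG Rangding, LI Xiaowen

  1. College of Information Science and Engineering, Ningbo University, Ningbo, Zhejiang 315211, China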
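  • Corresponding author: YAN Diqun
  • About the authors: XIANG Li (born 1994), male, from Xiangxi, Hunan, M.S. candidate; research interests: multimedia communication and information security. YAN Diqun (born 1979), male, from Yuyao, Zhejiang, associate professor, Ph.D., CCF member; research interests: multimedia communication and information security. WANG Rangding (born 1962), male, from Tianshui, Gansu, professor, Ph.D., CCF member; research interests: multimedia communication security, information hiding, and steganalysis. LI Xiaowen (born 1996), male, from Wenzhou, Zhejiang, M.S. candidate; research interests: multimedia communication and information security.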
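  • Funding:
    National Natural Science Foundation of China (U1736215, 61672302); Natural Science Foundation of Zhejiang Province (LZ15F020002, LY17F020010); Natural Science Foundation of Ningbo (2017A610123); Ningbo University Discipline Fund (XKXL1509, XKXL1503).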

Abstract: Most existing forensic methods for digital speech target a single specific operation and therefore cannot identify several different operations at once. To solve this problem, a universal forensic algorithm was proposed that simultaneously detects various operations, such as pitch modification, low-pass filtering, high-pass filtering, and noise addition. Firstly, the statistical moments of Mel-Frequency Cepstral Coefficients (MFCC) were calculated, and cepstral mean and variance normalization was applied to the moments. Then, a multi-class classifier was constructed from multiple two-class classifiers. Finally, the classifier was used to identify the type of operation applied to the speech. Experimental results on the TIMIT and UME speech datasets show that the proposed universal features achieve detection accuracy above 97% for the various speech operations, and that the detection accuracy remains above 96% in the MP3 compression robustness test.
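
The feature step can be illustrated with a minimal sketch. The abstract does not state which moments are used or over which axis the normalization is taken, so the following are assumptions made only for illustration: the moments are the per-coefficient mean and variance over time, and the normalization scales each feature dimension to zero mean and unit variance across a set of utterances. MFCC extraction itself is not shown; the toy frames and the function names `moments` and `mean_variance_normalize` are hypothetical.

```python
from math import sqrt
from statistics import fmean, pvariance


def moments(frames):
    """Return [means..., variances...] over time, one pair per coefficient.

    frames: list of frames, each a list of cepstral coefficients
    (hypothetical input standing in for real MFCC frames).
    """
    tracks = list(zip(*frames))  # one tuple of values per coefficient
    return [fmean(t) for t in tracks] + [pvariance(t) for t in tracks]


def mean_variance_normalize(vectors, eps=1e-12):
    """Scale every feature dimension to zero mean and unit variance.

    Assumption: statistics are taken across the given set of vectors.
    """
    columns = list(zip(*vectors))
    mus = [fmean(c) for c in columns]
    sigmas = [sqrt(pvariance(c)) for c in columns]
    return [
        [(x - mu) / (sigma + eps) for x, mu, sigma in zip(v, mus, sigmas)]
        for v in vectors
    ]


# Toy data: three "utterances", each with 4 frames of 2 coefficients.
utterances = [
    [[1.0, 2.0], [2.0, 0.0], [3.0, 2.0], [2.0, 0.0]],
    [[0.0, 1.0], [0.0, 3.0], [4.0, 1.0], [4.0, 3.0]],
    [[5.0, 5.0], [7.0, 6.0], [5.0, 7.0], [7.0, 6.0]],
]
features = [moments(u) for u in utterances]
normalized = mean_variance_normalize(features)
```

Each utterance is reduced to a fixed-length vector regardless of its duration, which is what allows a standard classifier to be trained on speech clips of different lengths.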

Key words: speech forensics, Mel-Frequency Cepstral Coefficient (MFCC), operation trace, multi-class classifier

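Abstract: Existing research on digital speech forensics mainly focuses on detecting a single operation and cannot make a decision about unrelated operations. To address this problem, a digital speech forensics method was proposed that can simultaneously detect four operations: pitch modification, low-pass filtering, high-pass filtering, and noise addition. Firstly, the normalized statistical-moment features of the Mel-Frequency Cepstral Coefficients (MFCC) of the speech were calculated. Then, multiple two-class classifiers were trained on the features and combined by voting to obtain a multi-class classifier. Finally, the multi-class classifier was used to classify the speech under test. Experimental results on the TIMIT and UME speech databases show that the normalized MFCC statistical-moment features achieve detection rates above 97% in all intra-database experiments, and that the detection rate remains above 96% in the MP3 compression robustness test.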
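
Building a multi-class classifier from two-class classifiers by voting can be sketched as follows. The abstract does not name the two-class classifier or the pairing scheme, so this sketch assumes a one-vs-one arrangement and uses a nearest-centroid rule purely as a stand-in; the class labels, the toy 2-D features, and the helper names are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def centroid(points):
    """Component-wise mean of a list of feature vectors."""
    return [sum(col) / len(col) for col in zip(*points)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_pairwise(samples_by_class):
    """Build one two-class model per pair of classes (one-vs-one).

    Each model is just the two class centroids, a stand-in for
    whatever two-class classifier is actually used.
    """
    cents = {label: centroid(pts) for label, pts in samples_by_class.items()}
    return [
        ((a, cents[a]), (b, cents[b]))
        for a, b in combinations(sorted(cents), 2)
    ]


def predict(models, x):
    """Let every two-class model vote; return the most-voted label."""
    votes = Counter()
    for (a, ca), (b, cb) in models:
        votes[a if sq_dist(x, ca) <= sq_dist(x, cb) else b] += 1
    return votes.most_common(1)[0][0]


# Toy training set with hypothetical, well-separated 2-D features.
training = {
    "original": [[0.0, 0.0], [0.2, -0.1]],
    "low_pass": [[5.0, 0.0], [5.2, 0.1]],
    "high_pass": [[0.0, 5.0], [-0.1, 5.2]],
    "noise": [[5.0, 5.0], [5.1, 4.9]],
}
models = train_pairwise(training)
label = predict(models, [4.9, 0.2])
```

With k classes the one-vs-one scheme trains k(k-1)/2 two-class models, so the four toy classes above yield six models, and the correct class can collect at most k-1 votes.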

Keywords: speech forensics, Mel-Frequency Cepstral Coefficient (MFCC), processing trace, multi-class classifier
