Journal of Computer Applications ›› 2019, Vol. 39 ›› Issue (1): 126-130. DOI: 10.11772/j.issn.1001-9081.2018071596

• Papers of the 2018 National Annual Conference on Open Distributed and Parallel Computing (DPCS 2018) •

Digital speech forensics algorithm for multiple processing traces

XIANG Li, YAN Diqun, WANG Rangding, LI Xiaowen

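  1. Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo, Zhejiang 315211
  • Received: 2018-07-19 Revised: 2018-08-07 Online: 2019-01-10 Published: 2019-01-21
  • Corresponding author: YAN Diqun
  • About the authors: XIANG Li (born 1994), male, from Xiangxi, Hunan, M.S. candidate; main research interests: multimedia communication, information security. YAN Diqun (born 1979), male, from Yuyao, Zhejiang, associate professor, Ph.D., CCF member; main research interests: multimedia communication, information security. WANG Rangding (born 1962), male, from Tianshui, Gansu, professor, Ph.D., CCF member; main research interests: multimedia communication security, information hiding and steganalysis. LI Xiaowen (born 1996), male, from Wenzhou, Zhejiang, M.S. candidate; main research interests: multimedia communication, information security.
  • Funding:
    National Natural Science Foundation of China (U1736215, 61672302); Natural Science Foundation of Zhejiang Province (LZ15F020002, LY17F020010); Natural Science Foundation of Ningbo (2017A610123); Ningbo University Discipline Fund (XKXL1509, XKXL1503).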

Forensics algorithm of various operations for digital speech

XIANG Li, YAN Diqun, WANG Rangding, LI Xiaowen   

  1. Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo, Zhejiang 315211, China
  • Received: 2018-07-19 Revised: 2018-08-07 Online: 2019-01-10 Published: 2019-01-21
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (U1736215, 61672302), the Natural Science Foundation of Zhejiang Province (LZ15F020002, LY17F020010), the Natural Science Foundation of Ningbo (2017A610123), and the Ningbo University Fund (XKXL1509, XKXL1503).

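Abstract: Existing research on digital speech forensics focuses mainly on detecting a single specific operation and cannot identify unrelated operations. To address this problem, a digital speech forensic method is proposed that simultaneously detects four operations: pitch modification, low-pass filtering, high-pass filtering, and noise addition. First, normalized statistical-moment features of the Mel-Frequency Cepstral Coefficients (MFCC) of the speech are computed. Then, multiple binary classifiers are trained on these features and combined by voting into a multi-class classifier. Finally, the multi-class classifier is used to classify the speech under test. Experimental results on the TIMIT and UME speech corpora show that the normalized MFCC statistical-moment features achieve detection rates above 97% in all intra-database experiments, and that the detection rate remains above 96% in the robustness test against MP3 compression.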

Keywords: speech forensics, Mel-Frequency Cepstral Coefficient (MFCC), processing trace, multi-class classifier

Abstract: Most existing forensic methods for digital speech detect one specific operation and therefore cannot identify several different operations at once. To solve this problem, a universal forensic algorithm is proposed that simultaneously detects four operations: pitch modification, low-pass filtering, high-pass filtering, and noise addition. First, the statistical moments of the Mel-Frequency Cepstral Coefficients (MFCC) are calculated, and cepstral mean and variance normalization is applied to the moments. Then, a multi-class classifier is constructed from multiple two-class classifiers. Finally, the classifier is used to identify the type of operation applied to a speech sample. Experimental results on the TIMIT and UME speech datasets show that the proposed universal features achieve detection accuracy above 97% for the various speech operations, and that the detection accuracy remains above 96% in the MP3 compression robustness test.
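The pipeline summarized in the abstract (moments of per-frame coefficients, mean and variance normalization, then several two-class classifiers combined by voting) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and not taken from the paper: the moments are limited to the per-coefficient mean and standard deviation, the two-class classifiers are simple nearest-centroid rules arranged one-vs-one, and `toy_frames` generates synthetic stand-ins for MFCC frames. Only three of the classes are shown to keep the example short.

```python
from collections import Counter
from itertools import combinations
from statistics import fmean, pstdev


def moment_features(frames):
    """Per-coefficient mean and standard deviation over all frames (assumed moments)."""
    columns = list(zip(*frames))  # one tuple per coefficient index
    return [fmean(c) for c in columns] + [pstdev(c) for c in columns]


def fit_normalizer(features):
    """Mean and standard deviation of each feature dimension over the training set."""
    columns = list(zip(*features))
    means = [fmean(c) for c in columns]
    stds = [pstdev(c) or 1.0 for c in columns]  # avoid dividing by zero
    return means, stds


def normalize(feature, means, stds):
    """Mean and variance normalization of one feature vector."""
    return [(x - m) / s for x, m, s in zip(feature, means, stds)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_one_vs_one(features, labels):
    """One two-class (nearest-centroid, assumed) classifier per pair of classes."""
    by_class = {}
    for feature, label in zip(features, labels):
        by_class.setdefault(label, []).append(feature)
    centroids = {y: [fmean(c) for c in zip(*v)] for y, v in by_class.items()}
    return [(a, centroids[a], b, centroids[b])
            for a, b in combinations(sorted(centroids), 2)]


def predict(model, feature):
    """Each pairwise classifier casts one vote; the most-voted class wins."""
    votes = Counter()
    for a, centroid_a, b, centroid_b in model:
        closer_to_a = sq_dist(feature, centroid_a) <= sq_dist(feature, centroid_b)
        votes[a if closer_to_a else b] += 1
    return votes.most_common(1)[0][0]


def toy_frames(offset, spread):
    """Hypothetical stand-in for an MFCC matrix: 4 frames, 2 coefficients."""
    return [[offset + spread * k, offset - spread * k] for k in range(4)]


training = [
    ("original", toy_frames(0.0, 1.0)), ("original", toy_frames(0.2, 1.0)),
    ("low-pass", toy_frames(5.0, 1.0)), ("low-pass", toy_frames(5.2, 1.0)),
    ("noise", toy_frames(0.0, 3.0)), ("noise", toy_frames(0.2, 3.0)),
]
labels = [name for name, _ in training]
raw = [moment_features(frames) for _, frames in training]
means, stds = fit_normalizer(raw)
model = train_one_vs_one([normalize(f, means, stds) for f in raw], labels)


def classify(frames):
    return predict(model, normalize(moment_features(frames), means, stds))


print(classify(toy_frames(5.1, 1.0)))
```

With five classes (unprocessed speech plus the four operations), a one-vs-one arrangement like this would need ten pairwise classifiers. A real experiment would replace `toy_frames` with MFCCs extracted from audio and would likely use a stronger two-class classifier than nearest-centroid.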

Key words: speech forensics, Mel-Frequency Cepstral Coefficient (MFCC), operation trace, multi-class classifier

CLC number: