Journal of Computer Applications ›› 2021, Vol. 41 ›› Issue (3): 774-779. DOI: 10.11772/j.issn.1001-9081.2020060763

Special Issue: Cyberspace Security

• Cyber security •

Speech steganalysis method based on deep residual network

REN Yiming, WANG Rangding, YAN Diqun, LIN Yuzhen   

  1. Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo Zhejiang 315211, China
  • Received: 2020-06-08 Revised: 2020-10-14 Online: 2021-03-10 Published: 2020-12-22
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (U1736215, 61672302, 61901237), the Zhejiang Natural Science Foundation (LY20F020010, LY17F020010), and the Open Foundation of Key Laboratory of Mobile Network Application Technology of Zhejiang Province (F2018001).

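  • Corresponding author: WANG Rangding
  • About the authors: REN Yiming (born 1994), male, from Yiwu, Zhejiang, M.S. candidate; research interests: multimedia communication and information security. WANG Rangding (born 1962), male, from Tianshui, Gansu, professor, Ph.D., CCF member; research interests: multimedia communication, information security, and steganalysis. YAN Diqun (born 1979), male, from Yuyao, Zhejiang, associate professor, Ph.D., CCF member; research interests: multimedia communication and information security. LIN Yuzhen (born 1994), male, from Ningbo, Zhejiang, M.S. candidate; research interests: multimedia communication and information security.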

Abstract: To address the low detection performance of existing steganalysis for Least Significant Bit (LSB) steganography in WAV-format speech, a speech steganalysis method based on a deep residual network was proposed. First, the residual signal of the input speech signal was computed by a fixed convolutional layer composed of multiple sets of high-pass filters, and a truncated linear unit was applied to truncate the resulting residual signal. Then, a deep network was built by stacking convolutional layers and the designed residual blocks to extract deep steganographic features. Finally, the classification result was output by a classifier composed of a fully connected layer and a Softmax layer. Experimental results at different secret-information embedding rates for two steganography methods, Hide4PGP (Hide 4 Pretty Good Privacy) and LSBmatching (Least Significant Bit matching), show that the proposed method achieves higher detection accuracy than existing Convolutional Neural Network (CNN)-based steganalysis methods; in particular, for Hide4PGP at an embedding rate of 0.1 bps (bits per sample), its detection accuracy is nearly 7 percentage points higher than that of LinNet.
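
The preprocessing stage described in the abstract (fixed high-pass filtering followed by truncation) can be illustrated with a minimal pure-Python sketch. The abstract does not give the filter coefficients or the truncation threshold, so the filter bank `FILTER_BANK` (first- and second-order differences), the default `threshold=3`, and all function names below are assumptions chosen for illustration, not the authors' actual configuration.

```python
def high_pass_residual(samples, kernel):
    """Slide a fixed kernel over integer samples (valid-mode correlation)."""
    k = len(kernel)
    return [
        sum(kernel[j] * samples[i + j] for j in range(k))
        for i in range(len(samples) - k + 1)
    ]


def truncated_linear_unit(values, threshold):
    """Clip each value to [-threshold, threshold]; identity inside that range."""
    return [max(-threshold, min(threshold, v)) for v in values]


# Hypothetical filter bank (assumption): first- and second-order differences.
FILTER_BANK = [[-1, 1], [1, -2, 1]]


def preprocess(samples, threshold=3):
    """Return one truncated residual channel per filter in FILTER_BANK."""
    return [
        truncated_linear_unit(high_pass_residual(samples, kernel), threshold)
        for kernel in FILTER_BANK
    ]
```

The idea behind this kind of preprocessing is that high-pass filtering suppresses the slowly varying speech content, while truncation bounds the large residuals caused by the content itself, so the small perturbations that LSB-style embedding introduces are not swamped. In the proposed method, the truncated residual channels would then be passed to the stacked convolutional layers and residual blocks.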

Key words: audio, speech, Least Significant Bit (LSB), steganalysis, deep residual network, deep learning


CLC Number: