Journal of Computer Applications


Open-set source cell-phone identification based on feature interaction and representation enhancement

  

  • Received: 2024-12-26 Revised: 2025-01-06 Accepted: 2025-01-10 Online: 2025-01-15 Published: 2025-01-15

Open-set recognition of the source cell phone from speech based on feature interaction and representation enhancement

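YUE Feng (岳峰)1, PENG Yang (彭洋)2, SU Zhaopin (苏兆品)3, ZHANG Guofu (张国富)2, LIAN Chensi (廉晨思)4, YANG Bo (杨波)5, FANG Zhen (方振)2

  1. Institute of Science and Technology, Hefei University of Technology
  2. Hefei University of Technology
  3. School of Computer Science and Information Engineering, Hefei University of Technology
  4. Anhui University
  5. Physical Evidence Identification Management Division, Anhui Provincial Public Security Department
  • Corresponding author: SU Zhaopin (苏兆品)
  • Funding:
    Humanities and Social Sciences Research Planning Project of the Ministry of Education; Anhui Provincial Natural Science Foundation; Anhui Provincial Key Research and Development Program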

Abstract: Identifying the source cell phone of a speech recording has long been a key problem in multimedia forensics. However, existing research on source cell-phone identification is confined to the closed-set mode, in which the training set and the test set share the same categories. Such methods cannot guarantee recognition accuracy for unknown cell phones, which limits their application to unseen devices. To address this, an open-set source cell-phone identification method based on feature interaction and representation enhancement is proposed. First, GlobalBlock is designed on the multi-head attention model Fastformer to capture global features from the whole speech sample and obtain rich device information. Second, LocalBlock is designed on SE-Res2Block for local feature extraction; it enhances cell-phone features and suppresses features unrelated to source cell-phone identification. Then, an attention-based feature fusion mechanism deeply fuses the global features with the multi-layer local features. Finally, a source cell-phone identification network based on attention pooling is designed to improve recognition accuracy in the open-set mode. Comparative experiments on a speech dataset covering 13 cell-phone brands and 86 models show that the proposed method achieves better recognition of unknown cell phones and offers a reference technical solution for open-set source cell-phone identification.

Key words: Source cell-phone, Open-set recognition, Feature interaction, Representation enhancement, Deep fusion
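The attention pooling step named in the abstract can be illustrated with a minimal sketch. It is not the authors' network: the `query` vector, the dot-product scoring, and the cosine-similarity threshold used for the open-set decision are all assumptions made for illustration, since the abstract does not specify them.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames, query):
    """Collapse frame-level features (T frames x D dims) into one D-dim embedding.

    `query` is a hypothetical learned vector: each frame is scored by its dot
    product with the query, and the softmax of the scores weights the frames.
    """
    scores = [sum(f * q for f, q in zip(frame, query)) for frame in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    pooled = [sum(w * frame[d] for w, frame in zip(weights, frames))
              for d in range(dim)]
    return pooled, weights


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def same_device(emb_a, emb_b, threshold=0.7):
    """Assumed open-set decision: accept the pair if similarity clears a threshold."""
    return cosine(emb_a, emb_b) >= threshold


# Toy usage: two recordings with three 2-dim frames each.
query = [1.0, 0.0]
emb_a, _ = attention_pool([[1.0, 0.2], [0.9, 0.1], [0.1, 0.8]], query)
emb_b, _ = attention_pool([[0.8, 0.3], [1.0, 0.0], [0.2, 0.9]], query)
print(same_device(emb_a, emb_b))
```

Because the decision compares two embeddings rather than choosing among a fixed set of classes, a device model never seen in training can still be scored, which is the property an open-set setting needs.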

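Abstract (Chinese version, translated): Multimedia forensics based on audio recorded by cell phones has long been a research hotspot. However, existing work on identifying the source cell phone from speech is limited to the closed-set setting, in which the training set and the test set share the same set of categories. Recognition accuracy for unknown cell-phone categories cannot be guaranteed, so existing methods cannot be applied to unseen devices. To address this, an open-set method for identifying the source cell phone from speech is proposed, based on feature interaction and representation enhancement. Specifically, a global feature extraction module, GlobalBlock, is first designed on the multi-head attention module Fastformer to better capture global information from the whole speech sample and obtain rich device features. Second, a local feature extraction module, LocalBlocks, is designed on SE-Res2Block; it focuses on enhancing features related to device information and suppressing features irrelevant to source identification. Then, an attention-based feature fusion mechanism is designed to deeply fuse the global features with the multi-layer local features. Finally, a source cell-phone verification network based on attention pooling is designed to improve recognition accuracy in the open-set mode. Comparative experiments on a speech dataset covering 13 cell-phone brands and 86 models show that the proposed method can recognize cell phones of unknown categories, providing a reference technical solution for open-set identification of the source cell phone from speech.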

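Key words (Chinese version, translated): Source cell phone from speech, Open-set recognition, Feature interaction, Representation enhancement, Deep fusion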
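The abstract says the local branch, built on SE-Res2Block, enhances device-related features and suppresses unrelated ones. A minimal sketch of the squeeze-and-excitation (SE) gating that gives the block the "SE" in its name is shown here. It covers only the channel gating and omits the Res2 multi-scale convolutions and the residual connection; the weight matrices `w1` and `w2` are hypothetical stand-ins for learned parameters, and biases are left out.

```python
import math


def se_gate(features, w1, w2):
    """Reweight the channels of frame-level features (T frames x C channels).

    w1 (H x C) and w2 (C x H) are hypothetical learned weights of the
    bottleneck; a trained block would also carry bias terms.
    """
    n_frames = len(features)
    n_channels = len(features[0])
    # Squeeze: summarize each channel by its mean over time.
    squeezed = [sum(frame[c] for frame in features) / n_frames
                for c in range(n_channels)]
    # Excitation: bottleneck layer with ReLU, then one sigmoid gate per channel.
    hidden = [max(0.0, sum(w * z for w, z in zip(row, squeezed))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale: gates near 1 keep a channel, gates near 0 suppress it.
    scaled = [[frame[c] * gates[c] for c in range(n_channels)]
              for frame in features]
    return scaled, gates


# Toy usage: 2 frames, 2 channels, bottleneck of size 1.
features = [[2.0, 4.0], [4.0, 8.0]]
scaled, gates = se_gate(features, w1=[[1.0, 0.0]], w2=[[10.0], [-10.0]])
print(gates)
```

With these toy weights the first channel passes almost unchanged while the second is driven toward zero, which mirrors the enhance-and-suppress behavior the abstract describes.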
