Journal of Computer Applications


Deepfake speech detection model based on quantum-Transformer

  

  • Received:2025-11-03 Revised:2026-01-22 Accepted:2026-02-11 Online:2026-03-12 Published:2026-03-12
  • Contact: CHANG Yan

SONG Ziyang, CHANG Yan*, YAN Lili, ZHAO Yinshan, LIU Honglin, SONG Haiquan

  1. School of Cybersecurity (Xin Gu Industrial College), Chengdu University of Information Technology, Chengdu 610225, China
  • Corresponding author: CHANG Yan
  • Supported by:
    National Natural Science Foundation of China.

Abstract: Voice forgery technology poses a potential threat to people's daily lives. Classical fake speech detection models currently face challenges such as performance bottlenecks and excessive parameter counts. To address these issues, a quantum‑Transformer based fake speech detection model, the Quantum Security Speech Model (QSSM), was proposed. In this model, parameterized quantum circuits (PQC) were used to construct a quantum QKV mapping module that generates the Query, Key, and Value vectors; self‑attention between feature vectors was computed via the Swap test, and PQC‑based quantum attention pooling was employed to aggregate contextual information. Experimental results demonstrate that, on fake speech detection tasks, the model reduces the equal error rate by 0.5% to 4.5% compared with classical models such as RawNet2, while using 43% fewer parameters than a classical Transformer. The model provides a new pathway for deploying fake speech detection in resource‑constrained environments.
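To make the Swap-test attention idea concrete: a Swap test estimates the squared overlap |⟨ψ|φ⟩|² between two quantum states, which can replace the scaled dot product as an attention score. The sketch below is a classical NumPy simulation of that idea only; the function names and the softmax weighting are illustrative assumptions, not the paper's actual QSSM implementation, which runs the computation on parameterized quantum circuits.

```python
import numpy as np

def swap_test_score(psi, phi):
    # Swap test: the ancilla qubit measures 0 with probability
    # P(0) = (1 + |<psi|phi>|^2) / 2, so the circuit effectively
    # estimates the squared overlap |<psi|phi>|^2. Here we compute
    # that quantity directly on normalized state vectors.
    psi = psi / np.linalg.norm(psi)
    phi = phi / np.linalg.norm(phi)
    return float(np.abs(np.vdot(psi, phi)) ** 2)

def swap_test_attention(queries, keys, values):
    # Attention weights built from Swap-test overlaps instead of
    # scaled dot products (softmax weighting is an assumption here).
    scores = np.array([[swap_test_score(q, k) for k in keys]
                       for q in queries])
    weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
    return weights @ values
```

Because the score is a squared overlap, it lies in [0, 1]: identical (up to phase) vectors score 1, orthogonal vectors score 0, so the attention is insensitive to sign, unlike a dot product.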

Key words: quantum computing, machine learning, deepfake, attention module, speech detection


