Journal of Computer Applications (official website) ›› 2025, Vol. 45 ›› Issue (10): 3074-3082. DOI: 10.11772/j.issn.1001-9081.2024091307

• Artificial Intelligence •

  • Supported by:
    National Natural Science Foundation of China (62466045); Natural Science Foundation of Inner Mongolia (2023MS06012); Special Funds for Basic Scientific Research of Universities Directly under the Inner Mongolia Autonomous Region (2023RCTD027); Special Funds for Basic Scientific Research of Universities Directly under the Inner Mongolia Autonomous Region (2024QNJS047)

Multimodal adversarial example generation method for Chinese text classification

Yongping WANG1(), Yao LIU2, Xiaolin ZHANG2, Jingyu WANG2, Lixin LIU3   

  1. School of Automation and Electrical Engineering, Inner Mongolia University of Science and Technology, Baotou, Inner Mongolia 014010, China
    2. School of Digital Intelligent Industry (School of Cyber Science and Technology), Inner Mongolia University of Science and Technology, Baotou, Inner Mongolia 014010, China
    3. School of Information, Renmin University of China, Beijing 100872, China
  • Received:2024-09-06 Revised:2025-02-23 Accepted:2025-02-27 Online:2025-03-26 Published:2025-10-10
  • Contact: Yongping WANG
  • About author:王永平(1984—),女,内蒙古赤峰人,讲师,硕士,主要研究方向:人工智能安全、大数据隐私保护 imust_wyp@163.com
    LIU Yao, born in 1999, M. S. candidate. Her research interests include artificial intelligence security.
    ZHANG Xiaolin, born in 1966, Ph. D., professor. Her research interests include artificial intelligence security, big data privacy protection.
    WANG Jingyu, born in 1976, Ph. D., professor. His research interests include big data and security, blockchain and security.
    LIU Lixin, born in 1983, Ph. D. candidate, lecturer. Her research interests include data security, privacy protection, blockchain, database.


Abstract:

To address the problem that existing Chinese text adversarial example generation methods rely on a single important-word localization method and a single transformation strategy, which makes it difficult to improve the attack success rate and the quality of adversarial examples, a multimodal adversarial example generation method for Chinese text classification was proposed from the perspectives of the morphology, pronunciation, and semantics of Chinese characters. In the word-importance calculation stage, a masked model and the victim model's output were used to obtain confidence probabilities, the dispersion of the predicted words was calculated as the sensitivity of each position, and the two were combined to determine the perturbation priority. In the adversarial transformation stage, a multimodal attack strategy combining the glyph, pronunciation, and semantic features of Chinese characters was designed to generate adversarial examples, with candidate examples generated by a lexicon, a Convolutional Neural Network (CNN)-based glyph similarity comparison model, and a Masked Language Model (MLM). Experimental results show that the proposed method achieves attack success rates of 33.2% to 65.8% against the relatively robust BERT (Bidirectional Encoder Representations from Transformers) and RoBERTa (Robustly optimized BERT pretraining approach) models, and that the generated adversarial examples can improve model robustness through adversarial training.
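The two-stage pipeline the abstract describes can be sketched in miniature. The toy classifier, the hand-written candidate table, and the entropy-based dispersion measure below are illustrative stand-ins, not the paper's implementation (which uses BERT-style victim models, a pinyin lexicon, a CNN glyph-similarity model, and an MLM for candidate generation):

```python
import math

# Toy stand-in classifier: the word "好" drives a positive prediction.
def toy_classify(words):
    pos = 0.9 if "好" in words else 0.2
    return [pos, 1.0 - pos]   # [P(positive), P(negative)]

def word_importance(words, classify, mask_token="[MASK]"):
    """Rank positions by perturbation priority: the confidence drop when a
    position is masked, plus the dispersion (here: output entropy) of the
    classifier's distribution at that masked position."""
    base = classify(words)
    label = base.index(max(base))
    scores = []
    for i in range(len(words)):
        probs = classify(words[:i] + [mask_token] + words[i + 1:])
        drop = base[label] - probs[label]
        entropy = -sum(p * math.log(p) for p in probs if p > 0)
        scores.append((drop + entropy, i))
    # Higher combined score = higher perturbation priority.
    return [i for _, i in sorted(scores, reverse=True)]

# Toy multimodal candidate table: a homophone, a similar glyph, and a
# synonym stand in for the pinyin lexicon, glyph model, and MLM outputs.
CANDIDATES = {"好": ["郝", "妤", "棒"]}

def attack(words, classify):
    """Greedily substitute words in priority order until the label flips."""
    label = classify(words).index(max(classify(words)))
    for i in word_importance(words, classify):
        for cand in CANDIDATES.get(words[i], []):
            adv = words[:i] + [cand] + words[i + 1:]
            if classify(adv).index(max(classify(adv))) != label:
                return adv   # adversarial example found
    return None

print(word_importance(["这", "部", "电影", "好"], toy_classify))  # → [3, 2, 1, 0]
print(attack(["这", "部", "电影", "好"], toy_classify))  # → ['这', '部', '电影', '郝']
```

In this toy run, masking "好" both drops the positive-class confidence and raises the output entropy, so position 3 is perturbed first, and the homophone substitution flips the predicted label.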

Key words: deep learning, text classification, adversarial example, multimodal, adversarial attack

CLC number: