Journal of Computer Applications, 2024, Vol. 44, Issue (11): 3479-3486. DOI: 10.11772/j.issn.1001-9081.2023101518

• Cyber security •

Model integrity verification framework of deep neural network based on fragile fingerprint

Xiang LIN1,2, Biao JIN1,2, Weijing YOU1,2, Zhiqiang YAO1,2, Jinbo XIONG1,2

  1. College of Computer and Cyber Security, Fujian Normal University, Fuzhou Fujian 350117, China
  2. Fujian Provincial Key Laboratory of Network Security and Cryptology (Fujian Normal University), Fuzhou Fujian 350007, China
  • Received:2023-11-07 Revised:2024-01-10 Accepted:2024-01-12 Online:2024-11-13 Published:2024-11-10
  • Contact: Jinbo XIONG
  • About author:LIN Xiang, born in 1996, M. S. candidate. His research interests include artificial intelligence security.
    JIN Biao, born in 1985, Ph. D., associate professor. His research interests include information security, privacy protection.
    YOU Weijing, born in 1994, Ph. D., associate professor. Her research interests include data security, artificial intelligence security.
    YAO Zhiqiang, born in 1967, Ph. D., professor. His research interests include information security, privacy protection.
  • Supported by:
    National Natural Science Foundation of China(62272102);Key Program of Natural Science Foundation of Fujian Province(2023J02014);Natural Science Foundation of Fujian Province(2023J01531)

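  • About author: LIN Xiang (born 1996), male, from Xiamen, Fujian, M. S. candidate, CCF member. Research interests: artificial intelligence security.
    JIN Biao (born 1985), male, from Lu'an, Anhui, associate professor, Ph. D., CCF member. Research interests: information security, privacy protection.
    YOU Weijing (born 1994), female, from Sanming, Fujian, associate professor, Ph. D., CCF member. Research interests: data security, artificial intelligence security.
    YAO Zhiqiang (born 1967), male, from Putian, Fujian, professor, Ph. D., CCF senior member. Research interests: information security, privacy protection.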

Abstract:

Pre-trained models are vulnerable to attacks by external adversaries, such as model fine-tuning and pruning, which destroy model integrity. To address this issue, a fragile fingerprint framework for black-box models, FFWAS (Fragile Fingerprint With Adversarial Samples), was proposed. Firstly, a model replication framework requiring no prior knowledge was introduced, and FFWAS generated an independent model copy for each user. Then, a black-box approach was employed to place a fragile fingerprint trigger set at the model's boundary, so that if the model was modified and the boundary changed, the trigger set would be misclassified. Finally, users verified the integrity of the model with the fragile fingerprint trigger set on their model copies: a trigger set recognition rate below the predefined threshold indicated that the model integrity had been compromised. The effectiveness and fragility of FFWAS were analyzed through experiments on two public datasets, MNIST and CIFAR-10. Experimental results demonstrate that under both model fine-tuning and pruning attacks, the fingerprint recognition rates of FFWAS decrease significantly compared with those of the intact model and fall below the predefined thresholds. Compared with the Deep Neural Network Authentication framework (DeepAuth), which is based on model uniqueness and fragile signatures, FFWAS improves the similarity between the trigger set and the original samples by approximately 22% and 16% on the two datasets, respectively, indicating better stealthiness.

Key words: neural network, pre-trained model, fragile fingerprint, model integrity, black-box model
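
The verification step described in the abstract (query the model on the trigger set, compute the recognition rate, compare it with a predefined threshold) can be illustrated with a minimal sketch. Everything below is a toy under stated assumptions, not the FFWAS implementation: the one-dimensional threshold classifiers, the names `boundary_pair`, `recognition_rate`, `verify_integrity` and the value `THRESHOLD = 0.9` are hypothetical, and the label-only bisection is a simple stand-in for the adversarial-sample generation that FFWAS uses to place triggers near the boundary.

```python
from typing import Callable, List, Tuple

Model = Callable[[float], int]   # black-box model: input -> predicted label only
Trigger = Tuple[float, int]      # (sample, label recorded at fingerprinting time)


def boundary_pair(model: Model, a: float, b: float, steps: int = 12) -> List[Trigger]:
    """Bisect between two inputs with different labels using label-only queries.

    Illustrative stand-in for trigger generation; NOT the paper's algorithm.
    Returns one sample on each side of the boundary, with its recorded label.
    """
    label_a, label_b = model(a), model(b)
    if label_a == label_b:
        raise ValueError("a and b must receive different labels")
    for _ in range(steps):
        mid = (a + b) / 2
        label_mid = model(mid)
        if label_mid == label_a:
            a = mid
        else:
            b, label_b = mid, label_mid
    return [(a, label_a), (b, label_b)]


def recognition_rate(model: Model, triggers: List[Trigger]) -> float:
    """Fraction of trigger samples that still receive their recorded label."""
    if not triggers:
        raise ValueError("trigger set must not be empty")
    hits = sum(1 for x, label in triggers if model(x) == label)
    return hits / len(triggers)


def verify_integrity(model: Model, triggers: List[Trigger], threshold: float) -> bool:
    """Integrity check: pass only if the recognition rate reaches the threshold."""
    return recognition_rate(model, triggers) >= threshold


def original_model(x: float) -> int:
    # Toy "intact" classifier with its boundary at 0.5 (assumption).
    return 1 if x > 0.5 else 0


def tampered_model(x: float) -> int:
    # Toy modified classifier: the boundary has shifted to 0.55 (assumption).
    return 1 if x > 0.55 else 0


THRESHOLD = 0.9  # hypothetical predefined threshold

# Fingerprint the intact model with samples placed close to its boundary.
triggers = boundary_pair(original_model, 0.0, 1.0) + boundary_pair(original_model, 0.2, 0.9)

if __name__ == "__main__":
    for name, model in [("original", original_model), ("tampered", tampered_model)]:
        rate = recognition_rate(model, triggers)
        print(name, rate, verify_integrity(model, triggers, THRESHOLD))
```

Because the triggers sit very close to the original boundary, even a small boundary shift flips the samples on one side of it. This is the fragility property the abstract relies on, whereas samples far from the boundary would survive the same modification and reveal nothing.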

CLC Number: