Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (7): 2080-2086. DOI: 10.11772/j.issn.1001-9081.2023071056
• Cyber security •
Zhi ZHANG, Xin LI, Naifu YE, Kaixi HU
Received: 2023-08-04
Revised: 2023-10-01
Accepted: 2023-10-10
Online: 2023-10-26
Published: 2024-07-10
Contact: Xin LI
About author: ZHANG Zhi, born in 1999 in Lüliang, Shanxi, M. S. candidate. His research interests include cyberspace security.
Zhi ZHANG, Xin LI, Naifu YE, Kaixi HU. DKP: defending against model stealing attacks based on dark knowledge protection[J]. Journal of Computer Applications, 2024, 44(7): 2080-2086.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2023071056
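Dataset | Training set (samples) | Validation set (samples) | Test set (samples) | Task |
---|---|---|---|---|
TP-US | 22 142 | 2 767 | 2 767 | Semantic analysis |
Yelp | 520 000 | 40 000 | 1 000 | Semantic analysis |
AG | 112 000 | 1 457 | 1 457 | Topic classification |
Blog | 7 098 | 887 | 887 | Topic classification |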
Tab. 1 Dataset information statistics
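Defense method | Description | Parameter settings |
---|---|---|
Laplace noise | Adds Laplace noise with variance σ to the predicted probability distribution | σ is chosen at random from the range (0, 1) |
Gaussian noise | Adds Gaussian noise with variance σ to the predicted probability distribution | σ is chosen at random from the range (0, 1) |
Temperature scaling | Manipulates the posterior probability distribution by changing the temperature coefficient T of the softmax layer | The temperature coefficient takes a fixed value |
Partitioned temperature scaling (proposed method) | Sets reasonable intervals and applies a different temperature coefficient to each range of posterior probabilities to manipulate the posterior probability distribution | A different temperature coefficient is set for each partition |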
Tab. 2 Comparison of defense methods by changing posterior probability
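The four defenses in Tab. 2 all perturb the posterior that the victim model returns. The following is a minimal Python sketch of each one, not the authors' implementation. The function names, the clip-and-renormalize step after adding noise, the rule that selects the temperature from the interval containing the top posterior probability, and the example intervals and temperatures are all assumptions made for illustration; the page does not give the paper's actual partitions.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; requires temperature > 0."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def renormalize(probs):
    """Clip to a small positive floor and rescale to sum to 1 (assumed step)."""
    clipped = [max(p, 1e-12) for p in probs]
    total = sum(clipped)
    return [p / total for p in clipped]


def gaussian_noise(probs, variance, rng):
    """Add zero-mean Gaussian noise with the given variance to each entry."""
    std = math.sqrt(variance)
    return renormalize([p + rng.gauss(0.0, std) for p in probs])


def laplace_noise(probs, variance, rng):
    """Add zero-mean Laplace noise with the given variance to each entry."""
    scale = math.sqrt(variance / 2.0)  # Laplace(0, b) has variance 2 * b**2
    # The difference of two exponential draws with mean b is Laplace(0, b).
    return renormalize(
        [p + rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
         for p in probs]
    )


def partitioned_temperature(logits, partitions):
    """Pick T from the interval that contains the top posterior probability.

    `partitions` is a hypothetical list of (upper_bound, temperature) pairs
    sorted by upper_bound; the paper's real intervals are not known here.
    """
    top = max(softmax(logits))
    for upper_bound, temperature in partitions:
        if top <= upper_bound:
            return softmax(logits, temperature)
    return softmax(logits)


rng = random.Random(0)
sigma = rng.uniform(0.0, 1.0)  # sigma drawn from (0, 1), as in Tab. 2
logits = [2.0, 1.0, 0.1]       # illustrative logits, not taken from the paper
posterior = softmax(logits)

print("laplace    ", laplace_noise(posterior, sigma, rng))
print("gaussian   ", gaussian_noise(posterior, sigma, rng))
print("fixed T=5  ", softmax(logits, temperature=5.0))
print("partitioned", partitioned_temperature(logits, [(0.5, 1.0), (1.0, 10.0)]))
```

Temperature scaling with any positive T keeps the predicted class unchanged, because dividing the logits by T preserves their order; the noise-based defenses offer no such guarantee in this sketch.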
Model | TP-US | Yelp | AG | Blog |
---|---|---|---|---|
Victim model | 85.5 | 95.6 | 94.5 | 97.1 |
Piracy model | 85.3 | 94.1 | 90.5 | 88.2 |
Tab. 3 Accuracies of victim model and piracy model
Defense method | TP-US accuracy/% | TP-US change (percentage points) | Yelp accuracy/% | Yelp change (percentage points) | AG accuracy/% | AG change (percentage points) | Blog accuracy/% | Blog change (percentage points) |
---|---|---|---|---|---|---|---|---|
No defense | 85.3 | n/a | 94.1 | n/a | 90.5 | n/a | 88.2 | n/a |
Laplace noise | 84.4 | -0.9 | 92.4 | -1.7 | 90.2 | -0.3 | 86.3 | -1.9 |
Gaussian noise | 85.6 | +0.3 | 92.7 | -1.4 | 90.2 | -0.3 | 86.2 | -2.0 |
Temperature scaling (T=0.0) | 84.6 | -0.7 | 93.7 | -0.4 | 90.0 | -0.5 | 85.6 | -2.6 |
Temperature scaling (T=0.5) | 85.1 | -0.2 | 93.8 | -0.3 | 90.3 | -0.2 | 85.7 | -2.5 |
Temperature scaling (T=5.0) | 85.3 | 0.0 | 94.5 | +0.4 | 90.9 | +0.4 | 86.7 | -1.5 |
Partitioned temperature scaling (proposed method) | 77.6 | -7.7 | 91.3 | -2.8 | 85.1 | -5.4 | 70.8 | -17.4 |
Tab. 4 Accuracies of different defense methods on piracy models
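The change columns in Tab. 4 are differences in percentage points between the piracy model's accuracy under a defense and its accuracy with no defense. The short Python sketch below recomputes them for three of the rows, using only numbers copied from the table; the helper name `point_changes` is introduced here for illustration.

```python
# Accuracies (%) of the piracy model, copied from Tab. 4.
DATASETS = ["TP-US", "Yelp", "AG", "Blog"]
no_defense = [85.3, 94.1, 90.5, 88.2]
defended = {
    "Laplace noise": [84.4, 92.4, 90.2, 86.3],
    "Temperature scaling (T=5.0)": [85.3, 94.5, 90.9, 86.7],
    "Partitioned temperature scaling": [77.6, 91.3, 85.1, 70.8],
}


def point_changes(accuracies, baseline):
    """Change in percentage points relative to the undefended piracy model."""
    return [round(a - b, 1) for a, b in zip(accuracies, baseline)]


for method, accuracies in defended.items():
    changes = point_changes(accuracies, no_defense)
    mean_change = sum(changes) / len(changes)
    print(method, dict(zip(DATASETS, changes)), f"mean {mean_change:+.2f}")
```

Averaged over the four datasets, the partitioned method lowers piracy-model accuracy by roughly 8.3 percentage points, while every baseline in the table stays within about 1.2 points of the undefended accuracy on average.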
Category | Initial confidence distribution | Confidence distribution after defense |
---|---|---|
Topic 1 | 0.00 | 0.10 |
Topic 2 | 0.00 | 0.10 |
Topic 3 | 1.00 | 0.11 |
Topic 4 | 0.00 | 0.10 |
Topic 5 | 0.00 | 0.10 |
Topic 6 | 0.00 | 0.10 |
Topic 7 | 0.00 | 0.10 |
Topic 8 | 0.00 | 0.10 |
Topic 9 | 0.00 | 0.10 |
Topic 10 | 0.00 | 0.10 |
Tab. 5 Confidence distribution before and after defense
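The pattern in Tab. 5, where a near one-hot posterior becomes almost uniform while the top class keeps a slight lead, is what a large softmax temperature produces. The Python sketch below reproduces the same shape under stated assumptions: the logits (10 for topic 3, 0 elsewhere) and the temperature T = 100 are hypothetical values chosen to match the table's rounding, not values reported in the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; requires temperature > 0."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for a 10-topic classifier that is confident in topic 3.
logits = [0.0] * 10
logits[2] = 10.0

before = softmax(logits)                    # T = 1: posterior served with no defense
after = softmax(logits, temperature=100.0)  # illustrative large T, not the paper's value

print([f"{p:.2f}" for p in before])  # topic 3 rounds to 1.00, the rest to 0.00
print([f"{p:.2f}" for p in after])   # topic 3 rounds to 0.11, the rest to 0.10
```

The predicted label is unchanged, so a legitimate user who only needs the top class is unaffected, whereas the flattened scores carry far less of the dark knowledge that a piracy model would distill from.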
Victim model | Victim model accuracy |
---|---|
BERT-large | 96.6 |
BERT-base | 97.1 |
Tab. 6 Accuracies of victim models
Victim model | Piracy model | Piracy model accuracy | Accuracy after defense |
---|---|---|---|
BERT-large | BERT-large | 88.3 | 34.0 |
BERT-base | BERT-large | 87.5 | 34.0 |
BERT-base | BERT-base | 88.2 | 76.1 |
BERT-large | BERT-base | 88.7 | 72.2 |
Tab. 7 Accuracies when defending against stealing attacks with different model structures