Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (12): 3888-3895. DOI: 10.11772/j.issn.1001-9081.2024111702
Ruilong CHEN1, Peng YI1,2, Tao HU1, Youjun BU1,2
Received: 2024-12-04
Revised: 2025-02-19
Accepted: 2025-02-25
Online: 2025-03-04
Published: 2025-12-10
Corresponding author: Tao HU
About the author: CHEN Ruilong, born in 2000 in Hebi, Henan, M. S. candidate. His research interests include network intrusion detection and deep learning.
Abstract:
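Deep learning is now widely used for encrypted traffic classification, yet it still faces challenges such as protecting user data privacy and supporting continual learning. To address these problems, an encrypted traffic classification method based on federated prototypical incremental learning (FPI-ETC) is proposed. During local training on each client, the Softmax classifier of the local model is replaced with a prototype classifier to remove the prediction bias caused by the Softmax classifier. When a new task arrives, each client generates multiple old-class exemplars from the old-class prototype vectors so that the local model does not forget previously learned knowledge; the server aggregates the class prototype vectors uploaded by the clients with a weighted average, so that the class prototypes are updated iteratively. Experimental results show that, with 5 client tasks and a sampling rate of 0.6, the final global accuracy of FPI-ETC exceeds that of existing methods by 9.93 to 33.45 percentage points on the ISCX VPN-nonVPN dataset and by 5.06 to 10.92 percentage points on the USTC-TFC2016 dataset, which verifies that FPI-ETC can effectively address catastrophic forgetting in dynamically updated encrypted network environments.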
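The three mechanisms named in the abstract (prototype classifier, weighted server-side aggregation of class prototypes, and exemplar generation from old-class prototypes) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. It assumes that a prototype is the mean feature vector of a class, that prediction picks the nearest prototype by Euclidean distance, and that aggregation weights are per-class sample counts. The function `generate_exemplars` and its Gaussian noise model are purely hypothetical, since the abstract does not say how exemplars are produced.

```python
import math
import random


def class_prototypes(features, labels):
    """Mean feature vector per class (assumed definition of a class prototype)."""
    sums, counts = {}, {}
    for vec, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    protos = {y: [v / counts[y] for v in acc] for y, acc in sums.items()}
    return protos, counts


def predict(protos, vec):
    """Prototype classifier: return the class whose prototype is nearest (Euclidean)."""
    return min(protos, key=lambda y: math.dist(protos[y], vec))


def aggregate(client_protos, client_counts):
    """Server side: average each class prototype, weighted by client sample counts."""
    merged, totals = {}, {}
    for protos, counts in zip(client_protos, client_counts):
        for y, proto in protos.items():
            n = counts[y]
            acc = merged.setdefault(y, [0.0] * len(proto))
            for i, v in enumerate(proto):
                acc[i] += n * v
            totals[y] = totals.get(y, 0) + n
    return {y: [v / totals[y] for v in acc] for y, acc in merged.items()}


def generate_exemplars(proto, num, sigma=0.1, seed=0):
    """Hypothetical replay step: sample pseudo-features around an old-class prototype."""
    rng = random.Random(seed)
    return [[rng.gauss(v, sigma) for v in proto] for _ in range(num)]
```

In this sketch a client would call `class_prototypes` on its extracted features, upload the prototypes and counts, and receive the output of `aggregate` back. Only prototype vectors leave the client, never raw traffic samples.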
Ruilong CHEN, Peng YI, Tao HU, Youjun BU. Encrypted traffic classification method based on federated prototypical incremental learning[J]. Journal of Computer Applications, 2025, 45(12): 3888-3895.
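| Symbol | Description |
|---|---|
|  | Feature extractor of the client model |
|  | Classifier of the client model |
|  | Class label of traffic sample x |
| N | Number of samples |
|  | Prototype vector of class y |
| C | Set of classes already learned by the client model |
| k | Client index |
| K | Total number of clients |
|  | Training samples of class c in the training set of client k at the current task stage |

Tab. 1 Symbols and explanations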
| ISCX VPN-nonVPN | | | USTC-TFC2016 | | |
|---|---|---|---|---|---|
| Traffic type | Training samples | Test samples | Traffic type | Training samples | Test samples |
| vpn_chat | 3 629 | 915 | Benign traffics | 210 206 | 52 549 |
| vpn_email | 460 | 109 | Cridex | 13 108 | 3 277 |
| vpn_file | 1 415 | 379 | Geodo | 32 758 | 8 189 |
| vpn_streaming | 3 982 | 959 | Htbot | 5 093 | 1 273 |
| vpn_torrent | 575 | 134 | Miuref | 10 784 | 2 696 |
| vpn_voip | 8 292 | 2 068 | Neris | 27 032 | 6 758 |
| chat | 5 415 | 1 387 | Nsis-ay | 5 140 | 1 285 |
|  | 4 546 | 1 136 | Shifu | 7 707 | 1 926 |
| file | 15 975 | 4 025 | Tinba | 6 803 | 1 700 |
| streaming | 16 068 | 3 932 | Virut | 26 482 | 6 620 |
| voip | 5 446 | 1 407 | Zeus | 8 776 | 2 194 |

Tab. 2 Information of ISCX VPN-nonVPN and USTC-TFC2016 datasets
| Method | ISCX VPN-nonVPN | | | | | | USTC-TFC2016 | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | T=1 | | T=3 | | T=5 | | T=1 | | T=3 | | T=5 | |
| | s=1.0 | s=0.6 | s=1.0 | s=0.6 | s=1.0 | s=0.6 | s=1.0 | s=0.6 | s=1.0 | s=0.6 | s=1.0 | s=0.6 |
| FedAvg | 95.40 | 95.21 | 68.41 | 55.23 | 41.00 | 33.00 | 99.76 | 99.75 | 86.52 | 83.65 | 70.62 | 61.40 |
| FedProx | 96.00 | 94.53 | 67.05 | 65.95 | 45.70 | 40.65 | 99.85 | 99.78 | 89.72 | 84.86 | 73.17 | 65.50 |
| GLFC | 95.16 | 94.56 | 86.51 | 67.39 | 62.38 | 56.52 | 99.32 | 99.16 | 87.51 | 85.63 | 75.84 | 67.26 |
| FPI-ETC | 96.32 | 94.54 | 88.09 | 74.05 | 72.65 | 66.45 | 99.88 | 99.81 | 92.40 | 87.83 | 79.37 | 72.32 |

Tab. 3 Final global accuracies (%) of global models on ISCX VPN-nonVPN and USTC-TFC2016 datasets
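The improvement ranges quoted in the abstract can be re-derived from the T=5, s=0.6 columns of Tab. 3. The short check below assumes those columns correspond to the abstract's setting of 5 client tasks and a sampling rate of 0.6. The names `final_acc` and `gain_range` are illustrative only.

```python
# Final global accuracy (%) at T=5, s=0.6, copied from Tab. 3.
final_acc = {
    "ISCX VPN-nonVPN": {"FedAvg": 33.00, "FedProx": 40.65, "GLFC": 56.52, "FPI-ETC": 66.45},
    "USTC-TFC2016": {"FedAvg": 61.40, "FedProx": 65.50, "GLFC": 67.26, "FPI-ETC": 72.32},
}


def gain_range(scores, ours="FPI-ETC"):
    """Smallest and largest gain of `ours` over the other methods, in percentage points."""
    gains = [round(scores[ours] - acc, 2) for name, acc in scores.items() if name != ours]
    return min(gains), max(gains)


# Maps each dataset to (smallest gain, largest gain) over the three baselines.
ranges = {dataset: gain_range(scores) for dataset, scores in final_acc.items()}
```

The smallest gain on both datasets is over GLFC and the largest is over FedAvg.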
| Method | Task 1→Task 2 | Task 2→Task 3 | Task 3→Task 4 | Task 4→Task 5 | Cumulative accuracy loss rate |
|---|---|---|---|---|---|
| FPI-ETC | 10.61 | 1.86 | 3.23 | 1.09 | 16.79 |
| FedAvg | 16.64 | 3.59 | 1.03 | 3.52 | 24.78 |
| FedProx | 18.98 | 3.13 | 4.68 | 2.12 | 28.91 |
| GLFC | 12.43 | 2.07 | 4.64 | 2.80 | 21.94 |

Tab. 4 Accuracy loss rates (%) of models on USTC-TFC2016 dataset
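The cumulative column in Tab. 4 appears to be the plain sum of the four stage-to-stage loss rates. The check below works under that assumption, since the definition of the accuracy loss rate itself is not given here. The variable names are illustrative.

```python
# Stage-to-stage accuracy loss rates (%) on USTC-TFC2016, copied from Tab. 4.
stage_losses = {
    "FPI-ETC": [10.61, 1.86, 3.23, 1.09],
    "FedAvg": [16.64, 3.59, 1.03, 3.52],
    "FedProx": [18.98, 3.13, 4.68, 2.12],
    "GLFC": [12.43, 2.07, 4.64, 2.80],
}
# Cumulative accuracy loss rate (%) as reported in the last column.
reported = {"FPI-ETC": 16.79, "FedAvg": 24.78, "FedProx": 28.91, "GLFC": 21.94}


def cumulative_loss(values):
    """Sum the per-stage loss rates, rounded to the table's two decimals."""
    return round(sum(values), 2)


# Methods whose summed stages disagree with the reported cumulative value.
mismatches = {
    method: (cumulative_loss(values), reported[method])
    for method, values in stage_losses.items()
    if cumulative_loss(values) != reported[method]
}
# Method with the lowest reported cumulative loss.
lowest = min(reported, key=reported.get)
```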
| Stored exemplars | ISCX VPN-nonVPN global accuracy/% | | USTC-TFC2016 global accuracy/% | |
|---|---|---|---|---|
| | FPI-ETC | GLFC | FPI-ETC | GLFC |
| 100 | 42.13 | 40.23 | 52.31 | 50.12 |
| 300 | 65.23 | 57.89 | 67.23 | 66.25 |
| 600 | 74.05 | 67.39 | 87.83 | 85.63 |
| 800 | 75.46 | 68.65 | 88.56 | 85.89 |
| 1 000 | 76.01 | 68.81 | 88.94 | 86.61 |
| 2 000 | 76.61 | 69.23 | 90.97 | 88.86 |

Tab. 5 Influence of replay sample size on final global accuracy
| Method | ISCX VPN-nonVPN | | USTC-TFC2016 | |
|---|---|---|---|---|
| | Total communication rounds | Training time/s | Total communication rounds | Training time/s |
| FedAvg | 250 | 63.82 | 150 | 266.19 |
| FedProx | 200 | 71.31 | 130 | 294.61 |
| GLFC | 180 | 67.85 | 110 | 279.83 |
| FPI-ETC | 150 | 76.50 | 100 | 320.52 |

Tab. 6 Communication costs and training time overheads of different methods
| Method | ISCX VPN-nonVPN | | | | | | USTC-TFC2016 | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | T=1 | | T=3 | | T=5 | | T=1 | | T=3 | | T=5 | |
| | s=1.0 | s=0.6 | s=1.0 | s=0.6 | s=1.0 | s=0.6 | s=1.0 | s=0.6 | s=1.0 | s=0.6 | s=1.0 | s=0.6 |
| FPI-ETC | 96.32 | 94.54 | 88.09 | 74.05 | 72.65 | 66.45 | 99.88 | 99.81 | 92.40 | 87.83 | 79.37 | 72.32 |
| Non-PC | 95.72 | 95.02 | 83.85 | 63.13 | 49.66 | 41.75 | 99.73 | 99.76 | 88.36 | 82.12 | 69.75 | 65.63 |
| Non-FG | 94.57 | 94.03 | 70.38 | 60.25 | 48.28 | 45.88 | 99.65 | 99.05 | 84.35 | 83.45 | 66.54 | 62.78 |

Tab. 7 Ablation experimental results (%)