Journal of Computer Applications ›› 2026, Vol. 46 ›› Issue (4): 1158-1170. DOI: 10.11772/j.issn.1001-9081.2025040474
• Cyber security •
Xiaoyu WANG1, Xin LI1,2,3, Di XUE1, Zhangtao JIANG1, Wei WANG1, Yanjun XIAO4
Received: 2025-04-29
Revised: 2025-06-26
Accepted: 2025-06-27
Online: 2025-07-07
Published: 2026-04-10
Contact: Xin LI
About author: WANG Xiaoyu, born in 2001 in Yichang, Hubei, China, M.S. candidate, CCF member. Her research interests include large language models and risk assessment.
Xiaoyu WANG, Xin LI, Di XUE, Zhangtao JIANG, Wei WANG, Yanjun XIAO. Vulnerability classification framework for video surveillance network security based on large language models[J]. Journal of Computer Applications, 2026, 46(4): 1158-1170.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2025040474
| Level | Total samples N | Classes | Training samples | Validation samples | Test samples |
|---|---|---|---|---|---|
| DT2-init | 5 676 | 21 | 1 600 | 2 038 | 2 038 |
| DT2-new | 4 742 | 26 | 480 | 2 131 | 2 131 |
| DT3 | 8 427 | 124 | 0 | 0 | 8 427 |
| DT4 | 5 624 | 65 | 0 | 0 | 5 624 |
| DT5 | 3 569 | 36 | 0 | 0 | 3 569 |
Tab. 1 Dataset division details by level
| Actual class | Predicted class | |
|---|---|---|
| | Predicted positive (1) | Predicted negative (0) |
| Actual positive (1) | True positive (TP) | False negative (FN) |
| Actual negative (0) | False positive (FP) | True negative (TN) |
Tab. 2 Confusion matrix of binary classification problem
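The Acc and MCC scores reported in the tables below follow directly from these four confusion-matrix counts. A minimal Python sketch (the function names and the example counts are ours, not from the paper):

```python
import math

def accuracy(tp, fp, fn, tn):
    """Fraction of samples classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)

def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient; defined as 0.0 when any
    row or column of the confusion matrix is empty."""
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den if den else 0.0

# Example with hypothetical counts:
# accuracy(50, 10, 10, 30) -> 0.8
# mcc(50, 10, 10, 30)      -> ~0.583
```

Unlike plain accuracy, MCC stays near zero for a classifier that ignores a rare class, which is why the paper reports both on these imbalanced CWE datasets.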
| | No. of top-level classes | Test-set classification accuracy | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| (10,5) | 33 | 0.913 | 0.872 | 0.851 | 0.818 | 0.775 | 0.625 | 0.809 | 0.361 |
| (10,10) | 33 | 0.920 | 0.894 | 0.848 | 0.805 | 0.749 | 0.689 | 0.817 | 0.475 |
| (20,8) | 30 | 0.948 | 0.922 | 0.824 | 0.819 | 0.779 | 0.638 | 0.822 | 0.441 |
| (20,16) | 30 | 0.930 | 0.922 | 0.837 | 0.828 | 0.805 | 0.745 | 0.844 | 0.620 |
| (40,8) | 26 | 0.936 | 0.855 | 0.838 | 0.829 | / | 0.799 | 0.852 | 0.613 |
| (40,16) | 26 | 0.940 | 0.884 | 0.863 | 0.858 | / | 0.833 | 0.876 | 0.658 |
| (40,20) | 26 | 0.952 | 0.912 | 0.872 | 0.868 | / | 0.851 | 0.891 | 0.670 |
| (60,18) | 20 | 0.934 | 0.923 | 0.901 | 0.898 | / | 0.886 | 0.908 | 0.758 |
| (60,30) | 20 | 0.940 | 0.933 | 0.902 | 0.882 | / | 0.871 | 0.905 | 0.736 |
Tab. 3 Experimental results with different parameter settings
| Stage | Classifier | Acc/% | MCC/% |
|---|---|---|---|
| Hierarchy judgment | SCP | 85.2 | 37.7 |
| Top-level classification | Text2Weak(3-small) | 39.8 | 38.3 |
| | Text2Weak(3-small)(top5) | 71.6 | 60.5 |
| | SCP | 68.2 | 66.4 |
| | IVCF-LLM(GLM-4-Flash) | 78.5 | 65.8 |
| | IVCF-LLM(GPT-3.5 Turbo) | 85.9 | 78.4 |
| Sub-level classification | SCP | 55.0 | 45.8 |
| | IVCF-LLM(GLM-4-Flash) | 83.7 | 64.5 |
| | IVCF-LLM(GPT-3.5 Turbo) | 86.9 | 67.1 |
| Global classification | Text2Weak(3-small) | 20.3 | 10.6 |
| | Text2Weak(3-small)(top5) | 36.3 | 25.1 |
| | Text2Weak(add-002) | 17.1 | 7.0 |
| | Text2Weak(add-002)(top5) | 27.1 | 14.3 |
| | SCP | 52.9 | 51.5 |
| | Prompt(GPT-3.5 Turbo) | 55.8 | 50.9 |
| | IVCF-LLM(GLM-4-Flash) | 65.2 | 58.3 |
| | IVCF-LLM(GPT-3.5 Turbo) | 75.0 | 65.7 |
Tab. 4 Results of comparison experiments
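The staged scores above reflect a hierarchical pipeline: a top-level CWE class is predicted first, and a sub-level class is then chosen within it, so a global prediction is correct only when every stage is. A schematic sketch of this composition (the stage classifiers here are toy stubs of our own, standing in for the paper's LLM-backed stages):

```python
def classify_hierarchical(description, classify_top, classify_sub):
    """Two-stage prediction: a top-level class first, then a sub-level
    class conditioned on it. An error at either stage propagates to
    the global result."""
    top = classify_top(description)       # e.g. a top-level CWE pillar
    sub = classify_sub(description, top)  # a child class within `top`
    return top, sub

# Toy stubs standing in for LLM-backed stage classifiers:
demo_top = lambda text: "CWE-284" if "access" in text else "CWE-707"
demo_sub = lambda text, top: f"{top}/child"
```

This composition explains why the global Acc rows are consistently lower than the per-stage rows: stage errors compound multiplicatively along the hierarchy.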
| Stage | Classifier | Acc/% | MCC/% |
|---|---|---|---|
| T2 | IVCF-LLM (unoptimized) | 55.9 | 47.8 |
| | IVCF-LLM (component 1 removed) | 79.9 | 69.0 |
| | IVCF-LLM (component 2 removed) | 81.5 | 72.0 |
| | IVCF-LLM (component 3 removed) | 78.5 | 68.7 |
| | IVCF-LLM (component 4 removed) | 74.4 | 63.6 |
| | IVCF-LLM | 85.9 | 78.4 |
| T3 | Standard Prompt | 67.7 | 63.6 |
| | IVCF-LLM | 89.5 | 86.4 |
| T4 | Standard Prompt | 85.0 | 68.8 |
| | IVCF-LLM | 98.0 | 97.4 |
| T5 | Standard Prompt | 89.0 | 75.8 |
| | IVCF-LLM | 99.1 | 98.0 |
Tab. 5 Ablation experimental results
| Level | Total samples N | Classes | Training samples | Validation samples | Test samples |
|---|---|---|---|---|---|
| DT2-init | 14 498 | 21 | 1 600 | 8 240 | 8 240 |
| DT2-new | 1 605 | 26 | 480 | 562 | 563 |
| DT3 | 14 453 | 124 | 0 | 0 | 14 453 |
| DT4 | 6 786 | 65 | 0 | 0 | 6 786 |
| DT5 | 3 564 | 36 | 0 | 0 | 3 564 |
Tab. 6 Matching statistics of cross-scenario vulnerability subset
| Stage | Classifier | Acc/% | MCC/% |
|---|---|---|---|
| Hierarchy judgment | SCP | 85.5 | 37.8 |
| Top-level classification | Text2Weak(3-small) | 38.9 | 37.8 |
| | Text2Weak(3-small)(top5) | 71.8 | 61.0 |
| | SCP | 68.5 | 64.4 |
| | IVCF-LLM(GLM-4-Flash) | 72.1 | 59.8 |
| | IVCF-LLM(GPT-3.5 Turbo) | 79.6 | 68.4 |
| Sub-level classification | SCP | 55.3 | 45.7 |
| | IVCF-LLM(GLM-4-Flash) | 83.7 | 64.5 |
| | IVCF-LLM(GPT-3.5 Turbo) | 86.7 | 67.0 |
| Global classification | Text2Weak(3-small) | 20.3 | 10.6 |
| | Text2Weak(3-small)(top5) | 36.5 | 25.0 |
| | Text2Weak(add-002) | 17.5 | 8.0 |
| | Text2Weak(add-002)(top5) | 27.0 | 14.3 |
| | SCP | 53.2 | 52.6 |
| | Prompt(GPT-3.5 Turbo) | 52.6 | 48.9 |
| | IVCF-LLM(GLM-4-Flash) | 60.2 | 51.3 |
| | IVCF-LLM(GPT-3.5 Turbo) | 69.1 | 60.7 |
Tab. 7 Robustness test results of cross-scenario CWE features
| Task type | No. of calls | Avg. response time/s | Avg. input tokens | Avg. output tokens |
|---|---|---|---|---|
| Skill generation | 24 | 11.69±0.6 | 9 259 | 598 |
| Skill fusion | 4 | 10.83±0.7 | 2 825 | 936 |
Tab. 8 Distribution of resource consumption