Journal of Computer Applications, 2026, Vol. 46, Issue (2): 416-426. DOI: 10.11772/j.issn.1001-9081.2025030271
• Artificial intelligence •
Fei GAO1, Dong CHEN1,2,3,4, Dixing BIAN1, Wenqiang FAN1, Qidong LIU1,2,3,4, Pei LYU1,2,3,4, Chaoyang ZHANG1,2,3,4, Mingliang XU1,2,3,4
Received: 2025-03-18
Revised: 2025-05-10
Accepted: 2025-05-13
Online: 2025-06-05
Published: 2026-02-10
Contact: Mingliang XU
About author: GAO Fei, born in 2001 in Luoyang, Henan, M.S. candidate, CCF member. His research interests include natural language processing.
Fei GAO, Dong CHEN, Dixing BIAN, Wenqiang FAN, Qidong LIU, Pei LYU, Chaoyang ZHANG, Mingliang XU. Multistage coupled decision-making framework for researcher redeployment after discipline revocation[J]. Journal of Computer Applications, 2026, 46(2): 416-426.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2025030271
| Symbol | Description |
|---|---|
| C | Discipline set containing all candidate disciplines |
| K | Top-K value in the recall stage, determining the number of candidate disciplines |
| ARM_Recall | Association recall model, based on a pre-trained BERT model |
| IoM_Optimize | Implicit optimization module, which generates alternative phrasings of research information |
| FGRM_Score | Fine-grained ranking model, fine-tuned on a manually annotated dataset |

Tab. 1 Description of symbols
| Association level | Score range | Count | Subtotal |
|---|---|---|---|
| No association | (0,0.2] | 326 | 326 |
| Moderate association | (0.2,0.4] | 537 | 1 262 |
|  | (0.4,0.6] | 395 |  |
|  | (0.6,0.8] | 330 |  |
| Strong association | (0.8,1.0] | 384 | 384 |

Tab. 2 Tier distribution of manually annotated data
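The score ranges in Tab. 2 partition (0, 1] into three tiers, with the middle three bins pooled into one "moderate" level. A small helper makes the binning explicit (the tier names are translations, and the function itself is illustrative):

```python
def association_tier(score):
    """Map a manually annotated score in (0, 1] to its tier per Tab. 2."""
    if score <= 0.2:
        return "no association"        # (0, 0.2]
    if score <= 0.8:
        return "moderate association"  # pools (0.2,0.4], (0.4,0.6], (0.6,0.8]
    return "strong association"        # (0.8, 1.0]
```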
| Category | Recall-stage data | Re-ranking stage (manual) | Re-ranking stage (full) |
|---|---|---|---|
| Total | 6 008 | 1 988 | 15 904 |
| Training set | 4 806 | 1 590 | 12 720 |
| Validation set | 601 | 199 | 1 592 |
| Test set | 601 | 199 | 1 592 |

Tab. 3 DSR-WCC dataset
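The counts in Tab. 3 are consistent with an 80/10/10 split; note also that every full re-ranking figure is exactly eight times its manual counterpart (1 988 × 8 = 15 904), consistent with each manual example being expanded into eight variants. A sketch reproducing the split arithmetic (the function is illustrative, not from the paper):

```python
def split_80_10_10(total):
    """Reproduce the ~80/10/10 split of Tab. 3: floor(80%) for
    training, remainder halved between validation and test."""
    train = int(total * 0.8)
    val = (total - train) // 2
    return train, val, total - train - val
```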
| Dataset | Model | Acc(↑) | Precision(↑) | Recall(↑) | F1(↑) | MSE(↓) |
|---|---|---|---|---|---|---|
| DSR-WCC | MCRF | 0.96 | 0.94 | 0.93 | 0.93 | 0.01 |
|  | MCRF (w/o DA) | 0.95 | 0.92 | 0.90 | 0.91 | 0.01 |
|  | GLM-4 | 0.22 | 0.61 | 0.21 | 0.15 | 0.17 |
|  | GPT-3.5 | 0.28 | 0.55 | 0.28 | 0.25 | 0.15 |
|  | GPT-4o | 0.38 | 0.62 | 0.37 | 0.37 | 0.12 |
| DSP | MCRF | 0.90 | 0.89 | 0.89 | 0.90 | 0.02 |
|  | MCRF (w/o DA) | 0.88 | 0.85 | 0.88 | 0.90 | 0.04 |
|  | GLM-4 | 0.25 | 0.56 | 0.21 | 0.18 | 0.18 |
|  | GPT-3.5 | 0.28 | 0.58 | 0.28 | 0.25 | 0.15 |
|  | GPT-4o | 0.40 | 0.61 | 0.39 | 0.40 | 0.11 |

Tab. 4 Evaluation metrics comparison of different models on test set
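For reference, the classification metrics reported in Tab. 4 can be computed as below; this sketch assumes macro averaging over classes, which the table itself does not state:

```python
def classification_metrics(y_true, y_pred):
    """Accuracy and macro-averaged precision/recall/F1."""
    labels = sorted(set(y_true) | set(y_pred))
    acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precs, recs, f1s = [], [], []
    for c in labels:
        tp = sum(t == p == c for t, p in zip(y_true, y_pred))  # true positives
        pp = sum(p == c for p in y_pred)                       # predicted positives
        ap = sum(t == c for t in y_true)                       # actual positives
        prec = tp / pp if pp else 0.0
        rec = tp / ap if ap else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precs.append(prec); recs.append(rec); f1s.append(f1)
    n = len(labels)
    return acc, sum(precs) / n, sum(recs) / n, sum(f1s) / n
```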
| Dataset | Model | Acc(↑) | Precision(↑) | Average time/s(↓) |
|---|---|---|---|---|
| DSR-WCC | MCRF | 0.96 | 0.94 | 37.8 |
|  | w/o RC | 0.97 | 0.96 | 1 371.6 |
| DSP | MCRF | 0.90 | 0.89 | 41.2 |
|  | w/o RC | 0.92 | 0.89 | 1 486.2 |

Tab. 5 Ablation study in recall stage
| Dataset | Model | Acc | Precision | Recall | F1 | MSE |
|---|---|---|---|---|---|---|
| DSR-WCC | MCRF | 0.96 | 0.94 | 0.93 | 0.93 | 0.01 |
|  | w/o IoM | 0.92 | 0.91 | 0.90 | 0.91 | 0.01 |
| DSP | MCRF | 0.90 | 0.89 | 0.89 | 0.90 | 0.02 |
|  | w/o IoM | 0.86 | 0.80 | 0.82 | 0.86 | 0.08 |

Tab. 6 Ablation study on implicit optimization module
| Dataset | Model | Number of keywords | Acc | Precision | F1 |
|---|---|---|---|---|---|
| DSR-WCC | MCRF | — | 0.96 | 0.94 | 0.93 |
|  | RM | 100 | 0.10 | 0.09 | 0.11 |
|  | RM | 1 000 | 0.20 | 0.20 | 0.18 |
| DSP | MCRF | — | 0.90 | 0.89 | 0.90 |
|  | RM | 100 | 0.13 | 0.12 | 0.12 |
|  | RM | 1 000 | 0.22 | 0.21 | 0.20 |

Tab. 7 Comparison between MCRF and rule-based method
| Teacher | Project title | Top-5 recall results | Ground-truth label |
|---|---|---|---|
| A | Intelligent audio/video control device and cloud platform service management system | 1) Computer Science and Technology 2) Software Engineering 3) Mechanical Engineering 4) Cybersecurity 5) Electrical Engineering | Computer Science and Technology |
| B | High-pressure synthesis, structure, and phase-transition mechanisms of novel superhard carbon materials | 1) Materials Science and Engineering 2) Chemistry 3) Metallurgical Engineering 4) Chemical Engineering and Technology 5) Physics | Physics |

Tab. 8 Case analysis for recall stage
| Teacher | Project title | Implicit optimization results | Candidate discipline project | Match scores | Final score | True score |
|---|---|---|---|---|---|---|
| A | Research on staged seismic protection mechanisms and design methods for modular steel-frame braced primary-secondary structures | 1) Research on design principles and methods for graded seismic protection of steel-frame braced primary-secondary structures 2) Seismic performance optimization and design schemes for steel-frame braced primary-secondary structures based on a staged protection strategy 3) Research on staged protection mechanisms and design techniques of steel-frame braced primary-secondary structures under earthquakes | Study on winter outdoor thermal comfort characteristics of urban residential areas in cold regions | 0.80 0.70 0.60 0.60 | 0.68 | 0.60 |

Tab. 9 Case analysis for reordering stage
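The final score of 0.68 reported in Tab. 9 is consistent with a simple mean of the four match scores (0.80 + 0.70 + 0.60 + 0.60) / 4 = 0.675, rounded to two decimals. This aggregation rule is an assumption for illustration, not stated by the table:

```python
def final_score(match_scores):
    """Aggregate per-paraphrase match scores into one final score.
    Assumes a simple arithmetic mean, which matches the Tab. 9 case."""
    return sum(match_scores) / len(match_scores)
```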