Multistage coupled decision-making framework for researcher redeployment after discipline revocation

doi:10.11772/j.issn.1001-9081.2025030271

Journal of Computer Applications, 2026, Vol. 46, Issue (2): 416-426. Section: Artificial Intelligence.

Fei GAO¹, Dong CHEN¹⁻⁴, Dixing BIAN¹, Wenqiang FAN¹, Qidong LIU¹⁻⁴, Pei LYU¹⁻⁴, Chaoyang ZHANG¹⁻⁴, Mingliang XU¹⁻⁴

1. School of Computer Science and Artificial Intelligence, Zhengzhou University, Zhengzhou, Henan 450001, China
2. Engineering Research Center of Intelligent Cluster System, Ministry of Education (Zhengzhou University), Zhengzhou, Henan 450001, China
3. National Supercomputing Center in Zhengzhou, Zhengzhou, Henan 450001, China
4. Henan Province Large Model Technology and New Software Engineering Research Center (Zhengzhou University), Zhengzhou, Henan 450001, China

Received: 2025-03-18 Revised: 2025-05-10 Accepted: 2025-05-13 Online: 2025-06-05 Published: 2026-02-10
Contact: Mingliang XU
About the authors:
GAO Fei, born in 2001, native of Luoyang, Henan, M.S. candidate, CCF member. His research interests include natural language processing.
CHEN Dong, born in 1994, native of Zhengzhou, Henan, Ph.D., lecturer. His research interests include large model decision-making, collaboration between large and small models, and cross-media anomaly detection.
BIAN Dixing, born in 2002, native of Zhengzhou, Henan, M.S. candidate. His research interests include natural language processing.
FAN Wenqiang, born in 1999, native of Zhumadian, Henan, M.S. candidate. His research interests include natural language processing.
LIU Qidong, born in 1993, native of Xinxiang, Henan, Ph.D., professor. His research interests include reinforcement learning and path planning.
LYU Pei, born in 1986, native of Zhengzhou, Henan, Ph.D., professor. His research interests include artificial intelligence, virtual reality, and human-machine integrated intelligent systems.
ZHANG Chaoyang, born in 1986, native of Zhengzhou, Henan, Ph.D., professor. His research interests include artificial intelligence, virtual reality, and human-machine integrated intelligent systems.
XU Mingliang, born in 1981, native of Xinyang, Henan, Ph.D., professor. His research interests include artificial intelligence, big data, robotics, and industrial software. Email: iexumingliang@zzu.edu.cn
Supported by:
National Natural Science Foundation of China (62325602, 62036010, 62276238, U24A20326); Foundation of Henan Educational Committee (25HASTIT034); Natural Science Foundation of Henan (232300421095)

Abstract

Researcher redeployment after discipline revocation currently relies on manual decision-making, which makes it difficult to coordinate the associations between disciplines effectively. Large Language Models (LLMs), with their strong knowledge analysis capabilities, offer a new approach to optimizing this redeployment. However, on university research data, represented here by scientific research information, they struggle with hard-to-understand professional terminology and a pronounced long-tail distribution. Therefore, a multistage coupled decision-making framework for researcher redeployment after discipline revocation, MCRF (Multistage Coupled Redeployment Framework), was proposed. MCRF consists of four stages (recall, semantic enhancement, pairing, and reranking) and decomposes a difficult problem into several relatively simple sub-problems. Firstly, a discipline research word cloud association dataset was constructed to alleviate the difficulty that general-purpose models have in understanding specialized academic terms. Secondly, an association recall algorithm was designed to quickly recall the Top-K disciplines associated with a piece of scientific research information, thereby reducing the overall decision-making time. Finally, an implicit optimization module was introduced to generate diverse formulations of scientific research information, so that research information from tail disciplines can be fully associated with researchers' research directions, and a fine-grained scientific research project ranking model was used to achieve accurate semantic matching. Experimental results on multiple datasets show that the proposed framework reaches a recall of 92% in the recall stage and an accuracy of 96% in the reranking stage, verifying the effectiveness of MCRF in the discipline structure optimization task.

Key words: Large Language Model (LLM), scientific research word cloud, discipline structure optimization, scientific research information, semantic matching
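The four stages named in the abstract (recall, semantic enhancement, pairing, reranking) can be illustrated with a toy pipeline. This is only a sketch under loud assumptions: the abstract describes an association recall algorithm, an implicit optimization module, and a fine-grained ranking model, but does not specify them on this page, so Jaccard word overlap and a synonym table stand in for all three. Every function name, the word clouds, and the synonym table below are hypothetical.

```python
def tokens(text: str) -> set[str]:
    """Lower-cased whitespace tokens of a text."""
    return set(text.lower().split())


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard overlap; a toy stand-in for a learned relevance score."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recall_top_k(info: str, word_clouds: dict[str, set[str]], k: int) -> list[str]:
    """Stage 1 (recall): keep only the k disciplines closest to the input."""
    query = tokens(info)
    ranked = sorted(word_clouds, key=lambda d: jaccard(query, word_clouds[d]), reverse=True)
    return ranked[:k]


def enhance(info: str, synonyms: dict[str, str]) -> list[str]:
    """Stage 2 (semantic enhancement): produce alternative formulations."""
    words = info.lower().split()
    variants = [info]
    for word, alternative in synonyms.items():
        if word in words:
            variants.append(" ".join(alternative if w == word else w for w in words))
    return variants


def pair(variants: list[str], candidates: list[str]) -> list[tuple[str, str]]:
    """Stage 3 (pairing): combine every formulation with every candidate."""
    return [(v, d) for v in variants for d in candidates]


def rerank(pairs: list[tuple[str, str]], word_clouds: dict[str, set[str]]) -> list[tuple[str, float]]:
    """Stage 4 (reranking): score each pair, keep the best score per discipline."""
    best: dict[str, float] = {}
    for variant, discipline in pairs:
        score = jaccard(tokens(variant), word_clouds[discipline])
        best[discipline] = max(best.get(discipline, 0.0), score)
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


def redeploy(info: str, word_clouds: dict[str, set[str]], synonyms: dict[str, str], k: int = 2):
    candidates = recall_top_k(info, word_clouds, k)
    return rerank(pair(enhance(info, synonyms), candidates), word_clouds)


# Hypothetical data, for illustration only.
WORD_CLOUDS = {
    "computer science": {"neural", "network", "language", "model", "algorithm"},
    "statistics": {"regression", "model", "sampling", "inference"},
    "materials science": {"alloy", "polymer", "crystal"},
}
SYNONYMS = {"nn": "neural network"}

ranking = redeploy("nn language model pruning", WORD_CLOUDS, SYNONYMS, k=2)
print(ranking)
```

In this toy run the recall stage discards "materials science" before any pairwise scoring happens, and the expanded formulation ("neural network" for "nn") raises the score of "computer science". That mirrors the two roles the abstract assigns to recall (less work downstream) and to diverse formulations (better coverage of sparse terminology).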

CLC number: TP391.1

Fei GAO, Dong CHEN, Dixing BIAN, Wenqiang FAN, Qidong LIU, Pei LYU, Chaoyang ZHANG, Mingliang XU. Multistage coupled decision-making framework for researcher redeployment after discipline revocation[J]. Journal of Computer Applications, 2026, 46(2): 416-426.

Figures/Tables: 19

References: 28


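Symbol	Meaning
C	Set of disciplines, containing all candidate disciplines
K	Top-K value of the recall stage, which determines the number of candidate disciplines
ARM_Recall	Association recall model, based on a pretrained BERT model
IoM_Optimize	Implicit optimization module, which generates different formulations of scientific research information
FGRM_Score	Fine-grained ranking model, fine-tuned on a manually annotated dataset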

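Association level	Score range	Count	Total
No association	(0, 0.2]	326	326
General association	(0.2, 0.4]	537	1,262
General association	(0.4, 0.6]	395	
General association	(0.6, 0.8]	330	
Strong association	(0.8, 1.0]	384	384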
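The score ranges in the table above are left-open, right-closed intervals, so a score of exactly 0.2 counts as "no association" and exactly 0.8 as "general association". A minimal sketch of that mapping follows; the function name is hypothetical, and the page does not say how the authors implement the discretization.

```python
def association_level(score: float) -> str:
    """Map a score in (0, 1] to the association level used in the table."""
    if not 0.0 < score <= 1.0:
        raise ValueError(f"score must lie in (0, 1], got {score}")
    if score <= 0.2:
        return "no association"
    if score <= 0.8:
        return "general association"
    return "strong association"


# Counts per score range, copied from the table.
COUNTS = {
    "(0, 0.2]": 326,
    "(0.2, 0.4]": 537,
    "(0.4, 0.6]": 395,
    "(0.6, 0.8]": 330,
    "(0.8, 1.0]": 384,
}

# The three middle ranges together form the "general association" subtotal.
general_total = COUNTS["(0.2, 0.4]"] + COUNTS["(0.4, 0.6]"] + COUNTS["(0.6, 0.8]"]
print(association_level(0.2), association_level(0.5), association_level(0.81), general_total)
```

The subtotal computed here matches the 1,262 reported for the general association level.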

Split	Recall-stage data	Reranking-stage manually annotated data	Reranking-stage full data
Total	6,008	1,988	15,904
Training set	4,806	1,590	12,720
Validation set	601	199	1,592
Test set	601	199	1,592

Dataset	Model	Acc (↑)	Precision (↑)	Recall (↑)	F1 (↑)	MSE (↓)
DSR-WCC	MCRF	0.96	0.94	0.93	0.93	0.01
DSR-WCC	MCRF (w/o DA)	0.95	0.92	0.90	0.91	0.01
DSR-WCC	GLM-4	0.22	0.61	0.21	0.15	0.17
DSR-WCC	GPT-3.5	0.28	0.55	0.28	0.25	0.15
DSR-WCC	GPT-4o	0.38	0.62	0.37	0.37	0.12
DSP	MCRF	0.90	0.89	0.89	0.90	0.02
DSP	MCRF (w/o DA)	0.88	0.85	0.88	0.90	0.04
DSP	GLM-4	0.25	0.56	0.21	0.18	0.18
DSP	GPT-3.5	0.28	0.58	0.28	0.25	0.15
DSP	GPT-4o	0.40	0.61	0.39	0.40	0.11
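For readers unfamiliar with the five columns above, the following sketch computes accuracy, precision, recall, F1, and mean squared error on a tiny made-up example. It assumes macro averaging over classes for precision, recall, and F1; this page does not state which averaging the authors use, and the labels and scores below are invented for illustration.

```python
def classification_report(y_true: list[str], y_pred: list[str]) -> dict[str, float]:
    """Accuracy plus macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    count = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / count,
        "recall": sum(recalls) / count,
        "f1": sum(f1s) / count,
    }


def mean_squared_error(s_true: list[float], s_pred: list[float]) -> float:
    """Average squared difference between reference and predicted scores."""
    return sum((a - b) ** 2 for a, b in zip(s_true, s_pred)) / len(s_true)


# Invented example: four samples, one "general" sample misclassified as "strong".
y_true = ["none", "general", "general", "strong"]
y_pred = ["none", "general", "strong", "strong"]
report = classification_report(y_true, y_pred)
error = mean_squared_error([0.1, 0.5, 0.7, 0.9], [0.1, 0.5, 0.9, 0.9])
print(report, error)
```

Accuracy counts exact label matches, while MSE measures how far the predicted association score is from the reference score, which is why a model can have a moderate MSE and still a low accuracy, as the LLM baselines in the table do.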

Dataset	Model	Acc (↑)	Precision (↑)	Average time/s (↓)
DSR-WCC	MCRF	0.96	0.94	37.8
DSR-WCC	w/o RC	0.97	0.96	1,371.6
DSP	MCRF	0.90	0.89	41.2
DSP	w/o RC	0.92	0.89	1,486.2
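The trade-off in the table above is easier to see as ratios. The short calculation below uses only the numbers from the table; reading "w/o RC" as the variant without the recall stage is an assumption based on the abstract's statement that recall reduces decision time.

```python
# (accuracy, average time in seconds) copied from the table above.
RESULTS = {
    "DSR-WCC": {"MCRF": (0.96, 37.8), "w/o RC": (0.97, 1371.6)},
    "DSP": {"MCRF": (0.90, 41.2), "w/o RC": (0.92, 1486.2)},
}

summary = {}
for dataset, rows in RESULTS.items():
    acc_full, time_full = rows["MCRF"]
    acc_ablated, time_ablated = rows["w/o RC"]
    summary[dataset] = {
        # How many times faster the full framework is than the ablated variant.
        "speedup": round(time_ablated / time_full, 1),
        # Accuracy given up in exchange for that speedup.
        "accuracy_drop": round(acc_ablated - acc_full, 2),
    }

print(summary)
```

On both datasets the full framework is roughly 36 times faster, at a cost of 0.01 to 0.02 in accuracy.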

Dataset	Model	Acc	Precision	Recall	F1	MSE
DSR-WCC	MCRF	0.96	0.94	0.93	0.93	0.01
DSR-WCC	w/o IoM	0.92	0.91	0.90	0.91	0.01
DSP	MCRF	0.90	0.89	0.89	0.90	0.02
DSP	w/o IoM	0.86	0.80	0.82	0.86	0.08

Dataset	Model	Number of keywords	Acc	Precision	F1
DSR-WCC	MCRF	N/A	0.96	0.94	0.93
DSR-WCC	RM	100	0.10	0.09	0.11
DSR-WCC	RM	1,000	0.20	0.20	0.18
DSP	MCRF	N/A	0.90	0.89	0.90
DSP	RM	100	0.13	0.12	0.12
DSP	RM	1,000	0.22	0.21	0.20
