基于多视角信息增强和层次化权重的关键短语抽取模型

doi:10.11772/j.issn.1001-9081.2025040517

《计算机应用》唯一官方网站 ›› 2026, Vol. 46 ›› Issue (4): 1050-1057.DOI: 10.11772/j.issn.1001-9081.2025040517

基于多视角信息增强和层次化权重的关键短语抽取模型

胡婕¹^,²^,³, 李鹏程¹, 孙军¹^,²^,³(), 张佳傲¹

^1.湖北大学计算机学院，武汉 430062
^2.大数据智能分析与行业应用湖北省重点实验室（湖北大学），武汉 430062
^3.智慧政务与人工智能应用湖北省工程研究中心（湖北大学），武汉 430062

收稿日期:2025-05-12 修回日期:2025-07-15 接受日期:2025-07-17 发布日期:2025-07-22 出版日期:2026-04-10
通讯作者: 孙军
作者简介:胡婕（1977—），女，湖北汉川人，教授，博士，主要研究方向：复杂语义大数据管理、自然语言处理
李鹏程（2000—），男，湖北枣阳人，硕士研究生，主要研究方向：自然语言处理
张佳傲（2001—），男，河南漯河人，硕士研究生，主要研究方向：自然语言处理。
基金资助:
国家自然科学基金资助项目(61977021)

Key phrase extraction model based on multi-perspective information enhancement and hierarchical weighting

Jie HU¹^,²^,³, Pengcheng LI¹, Jun SUN¹^,²^,³(), Jiaao ZHANG¹

^1.School of Computer Science，Hubei University，Wuhan Hubei 430062，China
^2.Hubei Key Laboratory of Big Data Intelligent Analysis and Application （Hubei University），Wuhan Hubei 430062，China
^3.Engineering Research Center of Hubei Province in Intelligent Government Affairs and Application of Artificial Intelligence，Wuhan Hubei 430062，China

Received:2025-05-12 Revised:2025-07-15 Accepted:2025-07-17 Online:2025-07-22 Published:2026-04-10
Contact: Jun SUN
About author:HU Jie， born in 1977， Ph. D.， professor. Her research interests include complex semantic big data management， natural language processing.
LI Pengcheng， born in 2000， M. S. candidate. His research interests include natural language processing.
ZHANG Jiaao， born in 2001， M. S. candidate. His research interests include natural language processing.
Supported by:
National Natural Science Foundation of China(61977021)

摘要/Abstract

摘要：

现有的无监督关键短语提取模型对复杂的上下文和多层次语义信息的捕获能力不足，无法获取多维度信息。因此，提出一种基于多视角信息增强和层次化权重的关键短语提取模型。首先，利用BERT（Bidirectional Encoder Representations from Transformers）预训练模型对文本和候选短语进行编码，获得嵌入表示；并且，通过加权平均池化优化文本嵌入，并计算它们与候选短语的全局相似度，以实现全局信息增强，提升对语义关联的理解。其次，提出基于图结构的边界感知局部中心性计算方法，以增强局部信息获取能力。最后，融合多因素计算权重，从多个维度评估候选短语的重要性。在Inspec、SemEval 2017和SemEval-2010等6个公开数据集上的实验结果表明，与基线模型PromptRank相比，所提模型的F1@5值提高了0.87~2.68个百分点，F1@10值提高了1.11~2.24个百分点，F1@15值提高了0.54~2.25个百分点。可见，所提模型的综合性能得到了有效提升。

关键词: 无监督学习, 关键短语提取, 信息增强, BERT模型, 层次化权重

Abstract:

The existing unsupervised key phrase extraction models have insufficient capability to capture complex contexts and multi-level semantic information， thereby failing to acquire multi-dimensional information. Therefore， a key phrase extraction model based on multi-perspective information enhancement and hierarchical weighting was proposed. Firstly， the BERT （Bidirectional Encoder Representations from Transformers） pre-trained model was employed to encode the text and candidate phrases， thereby obtaining the embedding representations. Besides， the text embeddings were optimized through weighted average pooling， and the global similarity between them and candidate phrases was calculated to achieve global information enhancement， thereby improving the understanding of semantic associations. Secondly， a graph structure-based boundary-aware local centrality calculation method was introduced to improve the ability to capture local information. Finally， multiple factors were integrated for weight calculation to evaluate the importance of candidate phrases from various dimensions. Experiments were conducted on six public datasets such as Inspec， SemEval 2017， and SemEval-2010. The results show that compared to the baseline model PromptRank， the proposed model achieves improvements in F1@5 score by 0.87 to 2.68 percentage points， has the F1@10 score increased by 1.11 to 2.24 percentage points， and the F1@15 score improved by 0.54 to 2.25 percentage points. It can be seen that the overall performance of the proposed model has been enhanced effectively.

Key words: unsupervised learning, key phrase extraction, information enhancement, BERT (Bidirectional Encoder Representations from Transformers) model, hierarchical weighting

中图分类号:

TP391.1

胡婕, 李鹏程, 孙军, 张佳傲. 基于多视角信息增强和层次化权重的关键短语抽取模型[J]. 计算机应用, 2026, 46(4): 1050-1057.

Jie HU, Pengcheng LI, Jun SUN, Jiaao ZHANG. Key phrase extraction model based on multi-perspective information enhancement and hierarchical weighting[J]. Journal of Computer Applications, 2026, 46(4): 1050-1057.

图/表 9

图1 本文模型的结构

Fig. 1 Structure of proposed model

表1 数据集详情

Tab. 1 Dataset details

数据集	文档数	文档平均长度	候选关键短语总数	黄金关键短语总数
Inspec	500	122	15 841	4 912
SemEval 2017	493	170	21 264	8 387
SemEval-2010	243	190	4 355	1 506
DUC2001	308	725	35 926	2 479
NUS	211	7 702	25 494	2 453
Krapivin	460	8 545	55 875	2 641

表2 模型参数

Tab. 2 Model parameters

参数	值	参数	值
BERT层数	12	b2	512
T5层数	24	隐藏层大小	768
b1	512	输入最大长度	512

表3 各模型在6个数据集上的评价指标对比 (%)

Tab. 3 Evaluation index comparison of different models on six datasets

评价指标	模型	Inspec	SemEval 2017	SemEval-2010	DUC2001	NUS	Krapivin
F1@5	TextRank	21.58（-12.31）	16.43（-13.39）	7.42（-11.92）	11.02（-17.81）	1.80（-16.63）	6.04（-10.94）
	SingleRank	14.88（-19.01）	18.23（-11.59）	8.69（-10.65）	19.14（-9.69）	2.98（-15.45）	8.12（-8.86）
	TopicRank	12.20（-21.69）	17.10（-12.72）	9.93（-9.41）	19.97（-8.86）	4.54（-13.89）	8.94（-8.04）
	MultipartiteRank	13.41（-20.48）	17.39（-12.43）	10.13（-9.21）	21.70（-7.13）	6.17（-12.26）	9.29（-7.69）
	YAKE	8.02（-25.87）	11.84（-17.98）	6.82（-12.52）	11.99（-16.84）	7.85（-10.58）	8.09（-8.89）
	EmbedRank（BERT）	28.92（-4.97）	20.03（-9.79）	10.46（-8.88）	8.12（-20.71）	3.75（-14.68）	4.05（-12.93）
	SIFRank（ELMo）	29.38（-4.51）	22.38（-7.44）	11.16（-8.18）	24.30（-4.53）	3.01（-15.42）	1.62（-15.36）
	MDERank（BERT）	26.17（-7.72）	22.81（-7.01）	12.95（-6.39）	13.05（-15.78）	15.24（-3.19）	11.78（-5.20）
	PromptRank	31.73（-2.16）	27.14（-2.68）	17.24（-2.10）	27.39（-1.44）	17.24（-1.19）	16.11（-0.87）
	MICRank	30.06（-3.83）	23.01（-6.84）	15.52（-3.83）	26.24（-2.59）	15.76（-2.67）	14.01（-2.97）
	Attention-Seeker	35.49（+1.60）	25.40（-4.42）	19.00（-0.34）	—	—	20.79（+3.81）
	本文模型	33.89	29.82	19.34	28.83	18.43	16.98
F1@10	TextRank	27.53（-12.59）	25.83（-13.61）	11.27（-11.34）	17.45（-15.87）	3.02（-18.41）	9.43（-8.39）
	SingleRank	21.50（-18.62）	27.73（-11.71）	12.94（-9.67）	23.86（-9.46）	4.51（-16.92）	10.53（-7.29）
	TopicRank	17.24（-22.88）	22.62（-16.82）	12.52（-10.09）	21.73（-11.59）	7.93（-13.50）	9.01（-8.81）
	MultipartiteRank	18.18（-21.94）	23.73（-15.71）	12.91（-9.70）	24.10（-9.22）	8.57（-12.86）	9.35（-8.47）
	YAKE	11.47（-28.65）	18.14（-21.30）	11.01（-11.60）	14.18（-19.14）	11.05（-10.38）	9.35（-8.47）
	EmbedRank（BERT）	38.55（-1.57）	31.01（-8.43）	16.35（-6.26）	11.62（-21.70）	6.34（-15.09）	6.60（-11.22）
	SIFRank（ELMo）	39.12（-1.00）	32.60（-6.84）	16.03（-6.58）	27.60（-5.72）	5.34（-16.09）	2.52（-15.30）
	MDERank（BERT）	33.81（-6.31）	32.51（-6.93）	17.07（-5.54）	17.31（-16.01）	18.33（-3.10）	12.93（-4.89）
	PromptRank	37.88（-2.24）	37.76（-1.68）	20.66（-1.95）	31.59（-1.73）	20.13（-1.30）	16.71（-1.11）
	MICRank	39.05（-1.07）	33.58（-5.86）	20.34（-2.27）	29.73（-3.59）	18.95（-2.93）	16.00（-1.82）
	Attention-Seeker	40.14（+0.02）	34.35（-4.91）	23.07（+0.46）	—	—	18.25（+0.43）
	本文模型	40.12	39.44	22.61	33.32	21.43	17.82
F1@15	TextRank	27.62（-12.80）	30.50（-12.61）	13.47（-9.67）	18.84（-13.67）	3.53（-17.23）	9.95（-6.99）
	SingleRank	24.13（-16.29）	31.73（-11.38）	14.4（-8.74）	23.43（-9.08）	4.92（-15.84）	10.42（-6.52）
	TopicRank	19.33（-21.09）	24.87（-18.24）	12.26（-10.88）	20.97（-11.54）	9.37（-11.39）	8.30（-8.64）
	MultipartiteRank	20.52（-19.90）	26.87（-16.24）	13.24（-9.90）	23.62（-8.89）	10.82（-9.94）	9.16（-7.78）
	YAKE	13.65（-26.77）	20.55（-22.56）	12.55（-10.59）	14.28（-18.23）	13.09（-7.67）	9.12（-7.82）
	EmbedRank（BERT）	39.77（-0.65）	36.72（-6.39）	19.35（-3.79）	13.58（-18.93）	8.11（-12.65）	7.84（-9.10）
	SIFRank（ELMo）	39.82（-0.60）	37.25（-5.86）	18.42（-4.72）	27.96（-4.55）	5.86（-14.90）	3.00（-13.94）
	MDERank（BERT）	36.17（-4.25）	37.18（-5.93）	20.09（-3.05）	19.13（-13.38）	17.95（-2.81）	12.58（-4.36）
	PromptRank	38.17（-2.25）	41.57（-1.54）	21.35（-1.79）	31.01（-1.50）	20.12（-0.64）	16.02（-0.92）
	MICRank	40.20（-0.22）	38.36（-3.75）	21.58（-1.56）	30.31（-3.04）	19.46（-1.30）	15.54（-1.40）
	Attention-Seeker	39.22（-1.20）	38.50（-3.61）	23.81（+0.67）	—	—	16.22（-0.72）
	本文模型	40.42	42.11	23.14	33.51	20.76	16.94

表3 各模型在6个数据集上的评价指标对比 (%)

Tab. 3 Evaluation index comparison of different models on six datasets

评价指标	模型	Inspec	SemEval 2017	SemEval-2010	DUC2001	NUS	Krapivin
F1@5	TextRank	21.58（-12.31）	16.43（-13.39）	7.42（-11.92）	11.02（-17.81）	1.80（-16.63）	6.04（-10.94）
	SingleRank	14.88（-19.01）	18.23（-11.59）	8.69（-10.65）	19.14（-9.69）	2.98（-15.45）	8.12（-8.86）
	TopicRank	12.20（-21.69）	17.10（-12.72）	9.93（-9.41）	19.97（-8.86）	4.54（-13.89）	8.94（-8.04）
	MultipartiteRank	13.41（-20.48）	17.39（-12.43）	10.13（-9.21）	21.70（-7.13）	6.17（-12.26）	9.29（-7.69）
	YAKE	8.02（-25.87）	11.84（-17.98）	6.82（-12.52）	11.99（-16.84）	7.85（-10.58）	8.09（-8.89）
	EmbedRank（BERT）	28.92（-4.97）	20.03（-9.79）	10.46（-8.88）	8.12（-20.71）	3.75（-14.68）	4.05（-12.93）
	SIFRank（ELMo）	29.38（-4.51）	22.38（-7.44）	11.16（-8.18）	24.30（-4.53）	3.01（-15.42）	1.62（-15.36）
	MDERank（BERT）	26.17（-7.72）	22.81（-7.01）	12.95（-6.39）	13.05（-15.78）	15.24（-3.19）	11.78（-5.20）
	PromptRank	31.73（-2.16）	27.14（-2.68）	17.24（-2.10）	27.39（-1.44）	17.24（-1.19）	16.11（-0.87）
	MICRank	30.06（-3.83）	23.01（-6.84）	15.52（-3.83）	26.24（-2.59）	15.76（-2.67）	14.01（-2.97）
	Attention-Seeker	35.49（+1.60）	25.40（-4.42）	19.00（-0.34）	—	—	20.79（+3.81）
	本文模型	33.89	29.82	19.34	28.83	18.43	16.98
F1@10	TextRank	27.53（-12.59）	25.83（-13.61）	11.27（-11.34）	17.45（-15.87）	3.02（-18.41）	9.43（-8.39）
	SingleRank	21.50（-18.62）	27.73（-11.71）	12.94（-9.67）	23.86（-9.46）	4.51（-16.92）	10.53（-7.29）
	TopicRank	17.24（-22.88）	22.62（-16.82）	12.52（-10.09）	21.73（-11.59）	7.93（-13.50）	9.01（-8.81）
	MultipartiteRank	18.18（-21.94）	23.73（-15.71）	12.91（-9.70）	24.10（-9.22）	8.57（-12.86）	9.35（-8.47）
	YAKE	11.47（-28.65）	18.14（-21.30）	11.01（-11.60）	14.18（-19.14）	11.05（-10.38）	9.35（-8.47）
	EmbedRank（BERT）	38.55（-1.57）	31.01（-8.43）	16.35（-6.26）	11.62（-21.70）	6.34（-15.09）	6.60（-11.22）
	SIFRank（ELMo）	39.12（-1.00）	32.60（-6.84）	16.03（-6.58）	27.60（-5.72）	5.34（-16.09）	2.52（-15.30）
	MDERank（BERT）	33.81（-6.31）	32.51（-6.93）	17.07（-5.54）	17.31（-16.01）	18.33（-3.10）	12.93（-4.89）
	PromptRank	37.88（-2.24）	37.76（-1.68）	20.66（-1.95）	31.59（-1.73）	20.13（-1.30）	16.71（-1.11）
	MICRank	39.05（-1.07）	33.58（-5.86）	20.34（-2.27）	29.73（-3.59）	18.95（-2.93）	16.00（-1.82）
	Attention-Seeker	40.14（+0.02）	34.35（-4.91）	23.07（+0.46）	—	—	18.25（+0.43）
	本文模型	40.12	39.44	22.61	33.32	21.43	17.82
F1@15	TextRank	27.62（-12.80）	30.50（-12.61）	13.47（-9.67）	18.84（-13.67）	3.53（-17.23）	9.95（-6.99）
	SingleRank	24.13（-16.29）	31.73（-11.38）	14.4（-8.74）	23.43（-9.08）	4.92（-15.84）	10.42（-6.52）
	TopicRank	19.33（-21.09）	24.87（-18.24）	12.26（-10.88）	20.97（-11.54）	9.37（-11.39）	8.30（-8.64）
	MultipartiteRank	20.52（-19.90）	26.87（-16.24）	13.24（-9.90）	23.62（-8.89）	10.82（-9.94）	9.16（-7.78）
	YAKE	13.65（-26.77）	20.55（-22.56）	12.55（-10.59）	14.28（-18.23）	13.09（-7.67）	9.12（-7.82）
	EmbedRank（BERT）	39.77（-0.65）	36.72（-6.39）	19.35（-3.79）	13.58（-18.93）	8.11（-12.65）	7.84（-9.10）
	SIFRank（ELMo）	39.82（-0.60）	37.25（-5.86）	18.42（-4.72）	27.96（-4.55）	5.86（-14.90）	3.00（-13.94）
	MDERank（BERT）	36.17（-4.25）	37.18（-5.93）	20.09（-3.05）	19.13（-13.38）	17.95（-2.81）	12.58（-4.36）
	PromptRank	38.17（-2.25）	41.57（-1.54）	21.35（-1.79）	31.01（-1.50）	20.12（-0.64）	16.02（-0.92）
	MICRank	40.20（-0.22）	38.36（-3.75）	21.58（-1.56）	30.31（-3.04）	19.46（-1.30）	15.54（-1.40）
	Attention-Seeker	39.22（-1.20）	38.50（-3.61）	23.81（+0.67）	—	—	16.22（-0.72）
	本文模型	40.42	42.11	23.14	33.51	20.76	16.94

表4 消融实验结果 (%)

Tab. 4 Ablation experimental results

模型	评价指标	Inspec	SemEval 2017	SemEval-2010	DUC2001	NUS	Krapivin
本文模型	F1@5	33.89	29.82	19.34	28.83	18.43	16.98
	F1@10	40.12	39.44	22.61	33.32	21.43	17.82
	F1@15	40.42	42.11	23.14	33.51	20.76	16.94
去除T5模型	F1@5	27.35	22.39	13.28	24.58	14.23	13.14
	F1@10	36.13	32.18	17.69	27.64	17.36	15.62
	F1@15	37.32	38.93	18.23	27.86	17.03	15.41
去除全局信息增强	F1@5	31.74	26.32	17.36	28.35	17.23	15.63
	F1@10	39.25	37.22	20.03	32.16	20.09	16.86
	F1@15	39.43	41.06	22.09	32.13	20.02	16.53
去除局部信息增强	F1@5	29.43	24.58	15.43	27.54	16.38	15.32
	F1@10	38.15	35.81	19.05	30.45	19.49	16.82
	F1@15	39.39	40.64	20.42	30.86	19.34	14.52
去除层次化权重	F1@5	30.25	24.69	16.35	27.32	15.96	15.21
	F1@10	38.69	36.24	19.85	30.02	19.73	16.52
	F1@15	38.92	41.75	20.32	30.51	19.67	16.32

表5 α超参数的实验结果

Tab. 5 Experimental results of hyperparameter α

$α$	F1@10/%
$α$	Inspec	SemEval 2017	SemEval-2010	DUC2001	NUS	Krapivin
0.5	38.90	38.44	21.11	31.32	20.43	16.82
0.8	40.12	39.44	22.61	33.32	20.53	16.72
1.0	40.01	38.84	21.61	32.99	20.93	17.62
1.2	40.02	39.04	22.01	32.32	21.43	17.82
1.5	39.91	38.94	22.21	33.02	21.03	17.72

表5 α超参数的实验结果

Tab. 5 Experimental results of hyperparameter α

$α$	F1@10/%
$α$	Inspec	SemEval 2017	SemEval-2010	DUC2001	NUS	Krapivin
0.5	38.90	38.44	21.11	31.32	20.43	16.82
0.8	40.12	39.44	22.61	33.32	20.53	16.72
1.0	40.01	38.84	21.61	32.99	20.93	17.62
1.2	40.02	39.04	22.01	32.32	21.43	17.82
1.5	39.91	38.94	22.21	33.02	21.03	17.72

表6 β超参数的实验结果

Tab. 6 Experimental results of hyperparameter β

$β$	F1@10/%
$β$	Inspec	SemEval 2017	SemEval-2010	DUC2001	NUS	Krapivin
0.10	38.90	38.44	21.11	31.32	20.43	16.82
0.15	39.41	38.34	21.43	33.12	20.53	17.82
0.20	40.12	39.44	22.61	33.32	21.43	17.82
0.25	40.02	39.04	22.01	32.32	21.33	17.62
0.30	39.91	38.94	22.21	33.02	21.03	17.82

表6 β超参数的实验结果

Tab. 6 Experimental results of hyperparameter β

$β$	F1@10/%
$β$	Inspec	SemEval 2017	SemEval-2010	DUC2001	NUS	Krapivin
0.10	38.90	38.44	21.11	31.32	20.43	16.82
0.15	39.41	38.34	21.43	33.12	20.53	17.82
0.20	40.12	39.44	22.61	33.32	21.43	17.82
0.25	40.02	39.04	22.01	32.32	21.33	17.62
0.30	39.91	38.94	22.21	33.02	21.03	17.82

表7 λ超参数的实验结果

Tab. 7 Experimental results of hyperparameter λ

$λ$	F1@10/%
$λ$	Inspec	SemEval 2017	SemEval-2010	DUC2001	NUS	Krapivin
0.7	38.90	38.44	21.11	31.32	20.43	16.82
0.8	40.02	38.14	22.41	32.32	21.43	16.72
0.9	40.12	39.44	22.61	33.32	20.93	17.72
1.0	40.02	39.04	22.01	32.32	20.83	17.02

表7 λ超参数的实验结果

Tab. 7 Experimental results of hyperparameter λ

$λ$	F1@10/%
$λ$	Inspec	SemEval 2017	SemEval-2010	DUC2001	NUS	Krapivin
0.7	38.90	38.44	21.11	31.32	20.43	16.82
0.8	40.02	38.14	22.41	32.32	21.43	16.72
0.9	40.12	39.44	22.61	33.32	20.93	17.72
1.0	40.02	39.04	22.01	32.32	20.83	17.02

表8 全局信息增强池化的实验结果 (%)

Tab. 8 Experimental results of global information enhancement pooling

模型	F1@10
模型	Inspec	DUC2001	NUS
本文模型	40.12	33.32	21.43
AvgPool	39.72	32.81	21.21
MaxPool	39.26	32.27	20.62

参考文献 27

[1]	胡少虎，张颖怡，章成志. 关键词提取研究综述［J］. 数据分析与知识发现， 2021， 5（3）： 45-59.
	HU S H， ZHANG Y Y， ZHANG C Z. Review of keyword extraction studies［J］. Data Analysis and Knowledge Discovery， 2021， 5（3）： 45-59.
[2]	罗燕，赵书良，李晓超，等. 基于词频统计的文本关键词提取方法［J］. 计算机应用， 2016， 36（3）： 718-725.
	LUO Y， ZHAO S L， LI X C， et al. Text keyword extraction method based on word frequency statistics［J］. Journal of Computer Applications， 2016， 36（3）： 718-725.
[3]	ROELLEKE T， WANG J. TF-IDF uncovered： a study of theories and probabilities［C］// Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. New York： ACM， 2008： 435-442.
[4]	CAMPOS R， MANGARAVITE V， PASQUALI A， et al. YAKE！ collection-independent automatic keyword extractor［C］// Proceedings of the 2018 European Conference on Information Retrieval， LNCS 10772. Cham： Springer， 2018： 806-810.
[5]	MIHALCEA R， TARAU P. TextRank： bringing order into text［C］// Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing. Stroudsburg： ACL， 2004： 404-411.
[6]	WAN X， XIAO J. CollabRank： towards a collaborative approach to single-document keyphrase extraction［C］// Proceedings of the 22nd International Conference on Computational Linguistics. ［S.l.］： Coling 2008 Organizing Committee， 2008： 969-976.
[7]	BOUGOUIN A， BOUDIN F， DAILLE B. TopicRank： graph-based topic ranking for keyphrase extraction［C］// Proceedings of the 6th International Joint Conference on Natural Language Processing. ［S.l.］： Asian Federation of Natural Language Processing， 2013： 543-551.
[8]	BOUDIN F. Unsupervised keyphrase extraction with multipartite graphs［C］// Proceedings of the 2018 Conference on the North American Chapter of the Association for Computational Linguistics： Human Language Technologies， Volume 2 （Short Papers）. Stroudsburg： ACL， 2018： 667-672.
[9]	WEN Y， YUAN H， ZHANG P. Research on keyword extraction based on Word2Vec weighted TextRank［C］// Proceedings of the 2nd IEEE International Conference on Computer and Communications. Piscataway： IEEE， 2016： 2109-2113.
[10]	FLORESCU C， CARAGEA C. PositionRank： an unsupervised approach to keyphrase extraction from scholarly documents［C］// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics （Volume 1： Long Papers）. Stroudsburg： ACL， 2017： 1105-1115.
[11]	WANG R， LIU W， McDONALD C. Using word embeddings to enhance keyword identification for scientific publications［C］// Proceedings of the 2015 Australasian Database Conference， LNCS 9093. Cham： Springer， 2015： 257-268.
[12]	MAHATA D， KURIAKOSE J， SHAH R， et al. Key2Vec： automatic ranked keyphrase extraction from scientific articles using phrase embeddings［C］// Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics： Human Language Technologies， Volume 2 （Short Papers）. Stroudsburg： ACL， 2018： 634-639.
[13]	BENNANI-SMIRES K， MUSAT C， HOSSMANN A， et al. Simple unsupervised keyphrase extraction using sentence embeddings［C］// Proceedings of the 22nd Conference on Computational Natural Language Learning. Stroudsburg： ACL， 2018： 221-229.
[14]	SUN Y， QIU H， ZHENG Y， et al. SIFRank： a new baseline for unsupervised keyphrase extraction based on pre-trained language model［J］. IEEE Access， 2020， 8： 10896-10906.
[15]	SARZYNSKA-WAWER J， WAWER A， PAWLAK A， et al. Detecting formal thought disorder by deep contextualized word representations［J］. Psychiatry Research， 2021， 304： No.114135.
[16]	DING H， LUO X. AttentionRank： unsupervised keyphrase extraction using self and cross attentions［C］// Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Stroudsburg： ACL， 2021： 1919-1928.
[17]	ZHANG L， CHEN Q， WANG W， et al. MDERank： a masked document embedding rank approach for unsupervised keyphrase extraction［C］// Findings of the Association for Computational Linguistics： ACL 2022. Stroudsburg： ACL， 2022： 396-409.
[18]	KONG A， ZHAO S， CHEN H， et al. PromptRank： unsupervised keyphrase extraction using prompt［C］// Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics. Stroudsburg： ACL， 2023： 9788-9801.
[19]	BAI R， LIU F， ZHUANG X， et al. MICRank： multi-information interconstrained keyphrase extraction［J］. Expert Systems with Applications， 2024， 249（Pt C）： No.123744.
[20]	LÓPEZ ZAPATA E D， TANG C， SHIMADA A. Attention-Seeker： dynamic self-attention scoring for unsupervised keyphrase extraction［C］// Proceedings of the 31st International Conference on Computational Linguistics. Stroudsburg： ACL， 2025： 5011-5026.
[21]	NAZARUKA E， OSIS J， GRIBERMAN V. Using Stanford CoreNLP capabilities for semantic information extraction from textual descriptions［C］// Proceedings of the 2019 International Conference on Evaluation of Novel Approaches to Software Engineering， CCIS 1172. Cham： Springer， 2020： 1-21.
[22]	HULTH A. Improved automatic keyword extraction given more linguistic knowledge［C］// Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing. Stroudsburg： ACL， 2003： 216-223.
[23]	KIM S N， MEDELYAN O， KAN M Y， et al. SemEval-2010 task 5： automatic keyphrase extraction from scientific［C］// Proceedings of the 5th International Workshop on Semantic Evaluation. Stroudsburg： ACL， 2010： 21-26.
[24]	AUGENSTEIN I， DAS M， RIEDEL S， et al. SemEval 2017 task 10： ScienceIE-extracting keyphrases and relations from scientific publications［C］// Proceedings of the 11th International Workshop on Semantic Evaluation. Stroudsburg： ACL， 2017： 546-555.
[25]	NGUYEN T D， KAN M Y. Keyphrase extraction in scientific publications［C］// Proceedings of the 2007 International Conference on Asian Digital Libraries， LNCS 4822. Berlin： Springer， 2007： 317-326.
[26]	WAN X， XIAO J. Single document keyphrase extraction using neighborhood knowledge［C］// Proceedings of the 23rd Conference on Artificial Intelligence. Palo Alto： AAAI Press， 2008： 855-860.
[27]	KRAPIVIN M， AUTAEU A， MARCHESE M. Large dataset for keyphrases extraction［R］. Trento： University of Trento， 2009.

基于多视角信息增强和层次化权重的关键短语抽取模型

Key phrase extraction model based on multi-perspective information enhancement and hierarchical weighting

RichHTML

PDF

可视化

摘要/Abstract

引用本文

使用本文

图/表 9

参考文献 27

相关文章 15

编辑推荐

Metrics

[1]	索逸凡, 刘松华, 郝秋智. 基于高阶特征聚合的时间序列异常检测方法[J]. 《计算机应用》唯一官方网站, 2026, 46(4): 1131-1138.
[2]	尹春勇, 张不凡. 基于多尺度的多变量时间序列异常检测模型[J]. 《计算机应用》唯一官方网站, 2026, 46(3): 790-797.
[3]	董莉梅, 李雁姿, 李家印, 许力. 基于邻域增强的无监督图异常检测[J]. 《计算机应用》唯一官方网站, 2026, 46(2): 458-466.
[4]	王艺涵, 路翀, 陈忠源. 跨模态文本信息增强的多模态情感分析模型[J]. 《计算机应用》唯一官方网站, 2025, 45(7): 2237-2244.
[5]	黄颖, 高胜美, 陈广, 刘苏. 结合信噪比引导的双分支结构和直方图均衡的低照度图像增强网络[J]. 《计算机应用》唯一官方网站, 2025, 45(6): 1971-1979.
[6]	何玉林, 李旭, 贺颖婷, 崔来中, 黄哲学. 基于最大均值差异的子空间高斯混合模型聚类集成算法[J]. 《计算机应用》唯一官方网站, 2025, 45(6): 1712-1723.
[7]	杨雅莉, 黎英, 章育涛, 宋佩华. 面向人脸识别的多模态研究方法综述[J]. 《计算机应用》唯一官方网站, 2025, 45(5): 1645-1657.
[8]	王文鹏, 秦寅畅, 师文轩. 工业缺陷检测无监督深度学习方法综述[J]. 《计算机应用》唯一官方网站, 2025, 45(5): 1658-1670.
[9]	杨定木, 倪龙强, 梁晶, 邱照原, 张永真, 齐志强. 基于语义相似度的协议转换方法[J]. 《计算机应用》唯一官方网站, 2025, 45(4): 1263-1270.
[10]	潘理虎, 彭守信, 张睿, 薛之洋, 毛旭珍. 面向运动前景区域的视频异常检测[J]. 《计算机应用》唯一官方网站, 2025, 45(4): 1300-1309.
[11]	安俊秀, 杨林旺, 柳源. 基于邻近性语义感知的无监督文本风格迁移[J]. 《计算机应用》唯一官方网站, 2025, 45(4): 1139-1147.
[12]	李林昊, 韩冬, 董永峰, 李英双, 王振. 基于关联信息增强与关系平衡的场景图生成方法[J]. 《计算机应用》唯一官方网站, 2025, 45(3): 953-962.
[13]	李雨轩, 陈斌, 咸伟志. 基于去噪反向蒸馏的无监督工业异常检测[J]. 《计算机应用》唯一官方网站, 2025, 45(11): 3721-3729.
[14]	蒋章涛, 李欣, 张士豪, 赵心阳. 融合BERT与X-means算法的微博舆情热度分析预测模型[J]. 《计算机应用》唯一官方网站, 2025, 45(10): 3138-3145.
[15]	王一铭, 李世源, 廖南清, 陈庆锋. 基于证据深度学习的不确定性感知无监督医学图像配准模型[J]. 《计算机应用》唯一官方网站, 2025, 45(10): 3371-3380.