Topic-prior-guided dual-context entity alignment model

doi:10.11772/j.issn.1001-9081.2025121526

Journal of Computer Applications

Received: 2025-12-22 Revised: 2026-03-03 Online: 2026-03-20 Published: 2026-03-20

主题先验引导的双上下文实体对齐模型

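Sheping ZHAI, Letong YANG, Xue LIU, Rui YANG

Xi'an University of Posts and Telecommunications

Corresponding author: Letong YANG
Supported by:
National Natural Science Foundation of China; National College Students' Innovation and Entrepreneurship Training Program; Shaanxi Provincial College Students' Innovation and Entrepreneurship Training Program; Key Research and Development Program of Shaanxi Province (two projects); Scientific Research Program of the Shaanxi Provincial Department of Education; Shaanxi Provincial Social Science Foundation; Communication Soft Science Project of the Ministry of Industry and Information Technology; Xi'an Social Science Planning Fund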

Abstract: Entity alignment (EA) in multi-source heterogeneous knowledge graphs (KGs) is crucial for identifying semantically equivalent entities across graphs. Two problems motivate this work: hard negative samples (entities that are highly similar in semantics but not equivalent) are difficult to separate, and relying only on local structure or textual similarity causes matching errors. To address them, a Topic-Prior-guided Dual-Context entity alignment (TPDC) model was proposed. First, a topic model was built from entity attribute texts to derive entity-level topic distributions, which served as topic priors to guide candidate pool construction and difficulty stratification, thus constraining the search space to a semantically concentrated subspace. Second, a dual-context encoding network with neighbor and relation branches was designed to capture fine-grained structural semantics from multi-hop neighbors and relation paths. Finally, a curriculum contrastive learning strategy was introduced to increase the sampling ratio and loss weights of hard negatives in an easy-to-hard manner, improving discrimination in late training. Results on the three subsets of DBP15K show that, compared with the second-best baseline on each subset, Hits at 1 (Hits@1) increases by 0.04 and 5.18 percentage points on two of the subsets, Hits at 10 (Hits@10) increases by 0.32, 1.13, and 0.05 percentage points, and Mean Reciprocal Rank (MRR) increases by 0.026, 0.106, and 0.053, confirming better overall ranking quality and robust handling of hard negatives.

Keywords: knowledge graph (KG), entity alignment (EA), topic prior, path-enhanced dual-context, curriculum contrastive learning
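The abstract reports results as Hits@1, Hits@10, and MRR. As a reading aid, the following minimal sketch shows how these ranking metrics are conventionally computed from the rank of the correct counterpart for each test entity. The function names and the sample ranks are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence


def hits_at_k(ranks: Sequence[int], k: int) -> float:
    """Fraction of test entities whose true counterpart is ranked in the top k.

    `ranks` holds 1-based ranks of the correct entity (hypothetical input).
    """
    return sum(1 for r in ranks if r <= k) / len(ranks)


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank over all test entities."""
    return sum(1.0 / r for r in ranks) / len(ranks)


# Made-up ranks for five test entities, for illustration only.
example_ranks = [1, 2, 1, 10, 4]
print(hits_at_k(example_ranks, 1))    # 0.4
print(hits_at_k(example_ranks, 10))   # 1.0
print(round(mean_reciprocal_rank(example_ranks), 2))  # 0.57
```

Hits@k is usually reported as a percentage, which is why the abstract states its gains in percentage points, while MRR is a value in (0, 1] and its gains are stated as absolute differences.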

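Abstract (Chinese version, translated): Entity alignment (EA) across multi-source heterogeneous knowledge graphs (KGs) is a key task for identifying semantically equivalent entities in different graphs. In practice, entities that are highly similar in semantics but not equivalent tend to form typical hard negative samples, and relying only on local neighborhood structure or textual similarity easily causes matching errors and makes the decision boundary hard to draw precisely. To address these problems, a Topic-Prior-guided Dual-Context entity alignment model (TPDC) is proposed. First, a topic model is built from entity attribute texts to generate entity-level topic distributions, which serve as topic priors to guide candidate pool construction and the difficulty grading of negative samples, constraining the alignment search space at the global semantic level to a semantically concentrated candidate subspace. Second, a dual-context encoding network consisting of a neighbor-context branch and a relation-context branch is designed to jointly model the fine-grained structural semantics of multi-hop neighbors and relation paths. Finally, a curriculum contrastive learning strategy is introduced that gradually raises the sampling ratio of hard negatives and increases their loss weights in an easy-to-hard order, so that in late training the model focuses on distinguishing hard negatives that are semantically close but not equivalent. Experimental results on the three subsets of DBP15K show that, compared with the second-best baseline on each subset, Hits@1 improves by 0.04 and 5.18 percentage points on two of the subsets, Hits@10 improves by 0.32, 1.13, and 0.05 percentage points, and Mean Reciprocal Rank (MRR) improves by 0.026, 0.106, and 0.053, respectively. These results further verify the advantage of TPDC in overall ranking quality and demonstrate its effectiveness and robustness in handling hard negative samples.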

Keywords: knowledge graph, entity alignment, topic prior, path-enhanced dual-context, curriculum contrastive learning
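The abstract describes the curriculum contrastive learning strategy only at a high level: the share of hard negatives and their loss weights grow from easy to hard as training proceeds. The sketch below is one possible reading, not the authors' implementation. The linear schedule, the start and end ratios, the temperature, and the weighted InfoNCE-style loss are all assumptions introduced for illustration.

```python
import math
from typing import Sequence


def hard_ratio(epoch: int, total_epochs: int,
               start: float = 0.1, end: float = 0.8) -> float:
    """Assumed linear ramp of the hard-negative sampling ratio over training."""
    progress = epoch / max(total_epochs - 1, 1)
    progress = min(max(progress, 0.0), 1.0)
    return start + progress * (end - start)


def weighted_info_nce(pos_sim: float,
                      neg_sims: Sequence[float],
                      neg_weights: Sequence[float],
                      temperature: float = 0.1) -> float:
    """InfoNCE-style loss where each negative carries its own weight.

    A larger weight on a hard negative makes it contribute more to the loss
    (assumed form; the abstract does not give the loss formula).
    """
    pos = math.exp(pos_sim / temperature)
    neg = sum(w * math.exp(s / temperature)
              for s, w in zip(neg_sims, neg_weights))
    return -math.log(pos / (pos + neg))


# Early epochs sample mostly easy negatives; late epochs mostly hard ones.
for epoch in (0, 5, 9):
    print(epoch, round(hard_ratio(epoch, total_epochs=10), 2))

# Similarities are made-up cosine values: one hard negative (0.8), one easy (0.1).
early = weighted_info_nce(0.9, [0.8, 0.1], [1.0, 1.0])
late = weighted_info_nce(0.9, [0.8, 0.1], [2.0, 1.0])
print(late > early)  # True
```

In this reading, the difficulty of each negative would come from the topic-prior stratification mentioned in the abstract; here it is simply encoded by hand in the similarity and weight lists.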

CLC Number: TP391.1

Sheping ZHAI, Letong YANG, Xue LIU, Rui YANG. Topic-prior-guided dual-context entity alignment model [J]. Journal of Computer Applications, DOI: 10.11772/j.issn.1001-9081.2025121526.
