Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (10): 3195-3202. DOI: 10.11772/j.issn.1001-9081.2024091388
• Data science and technology •
Zeyi CAO, Yan CHANG, Renxin LAI, Shibin ZHANG, Zhi QIN, Lili YAN, Xuejian ZHANG, Yuanhao DI
Received: 2024-10-07
Revised: 2024-12-19
Accepted: 2024-12-20
Online: 2025-03-14
Published: 2025-10-10
Contact: Yan CHANG
About author: CAO Zeyi, born in 1998, a native of Chengdu, Sichuan, M. S. candidate. His research interests include data mining, data fusion, knowledge graphs, and entity alignment.
Zeyi CAO, Yan CHANG, Renxin LAI, Shibin ZHANG, Zhi QIN, Lili YAN, Xuejian ZHANG, Yuanhao DI. Attribute-based entity alignment algorithm for decentralized data storage in large-scale institutions[J]. Journal of Computer Applications, 2025, 45(10): 3195-3202.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2024091388
Data source | Entities | Attribute kinds | Attribute triples | Total attribute types (shared) | Entity seed pairs (shared) |
---|---|---|---|---|---|
Data source 1 | 1 645 | 19 | 17 807 | 22 | 1 076 |
Data source 2 | 1 286 | 16 | 18 207 | | |
Tab. 1 Detailed information of CTD
Dataset | Entities | Attribute kinds | Attribute triples | Total attribute types (shared) | Entity seed pairs (shared) |
---|---|---|---|---|---|
WIKI_MOVIE | 9 616 | 7 | 67 676 | 14 | 4 763 |
IMDB_MOVIE | 9 771 | 7 | 67 278 | | |
Tab. 2 Detailed information of WIKI_MOVIE and IMDB_MOVIE datasets
Model | Hits@1/% | Hits@10/% | MRR | MR |
---|---|---|---|---|
BERT-INT | 29.18 | 31.43 | 0.228 | 4.39 |
MTransE | 25.20 | 33.12 | 0.318 | 3.14 |
GCN-Align | 31.23 | 41.20 | 0.332 | 3.01 |
PipEA | 74.62 | 81.12 | 0.625 | 1.61 |
RDGCN | 75.23 | 80.10 | 0.582 | 1.72 |
AttrGNN | 80.24 | 94.56 | 0.872 | 1.15 |
AutoAlign | 85.21 | 96.43 | 0.901 | 1.10 |
Proposed model | 90.45 | 99.60 | 0.937 | 1.07 |
Tab. 3 Comparison results of different models on CTD
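The column headers in Tab. 3 name ranking metrics that this page does not define. The sketch below shows how such metrics are commonly computed in entity alignment evaluation, assuming the usual definitions: each test entity gets the 1-based rank of its true counterpart in the candidate list, Hits@k is the percentage of ranks not exceeding k, MRR is the mean of the reciprocal ranks, and MR is the mean rank. The function names and the example ranks are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def hits_at_k(ranks: Sequence[int], k: int) -> float:
    """Percentage of test entities whose true match is ranked within the top k."""
    return 100.0 * sum(1 for r in ranks if r <= k) / len(ranks)


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Mean of 1/rank over all test entities (higher is better)."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def mean_rank(ranks: Sequence[int]) -> float:
    """Arithmetic mean of the ranks (lower is better)."""
    return sum(ranks) / len(ranks)


# Hypothetical 1-based ranks of the correct counterpart for five test entities.
example_ranks = [1, 1, 2, 1, 12]

print(f"Hits@1  = {hits_at_k(example_ranks, 1):.2f}%")
print(f"Hits@10 = {hits_at_k(example_ranks, 10):.2f}%")
print(f"MRR     = {mean_reciprocal_rank(example_ranks):.3f}")
print(f"MR      = {mean_rank(example_ranks):.2f}")
```

Under these definitions, higher Hits@k and MRR and lower MR indicate better alignment, which matches the ordering of the rows in Tab. 3.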
Model | Hits@1/% | Hits@10/% | MRR |
---|---|---|---|
TransE | 96.36 | 97.50 | 0.983 |
TranSparse | 95.72 | 97.00 | - |
MultiKE | 95.25 | 96.50 | - |
SEEA | 96.42 | 98.00 | - |
Proposed model | 98.45 | 99.76 | 0.997 |
Tab. 4 Comparison results of different models on WIKI_MOVIE-IMDB_MOVIE dataset
Group | With init_weight: Hits@1/% | With init_weight: Hits@10/% | With init_weight: MRR | Without init_weight: Hits@1/% | Without init_weight: Hits@10/% | Without init_weight: MRR |
---|---|---|---|---|---|---|
1 | 89.80 | 99.80 | 0.930 | 87.80 | 99.50 | 0.920 |
2 | 91.00 | 99.75 | 0.936 | 88.99 | 99.50 | 0.927 |
3 | 90.55 | 99.88 | 0.935 | 89.52 | 99.58 | 0.930 |
4 | 91.20 | 99.90 | 0.938 | 89.80 | 99.65 | 0.930 |
5 | 91.50 | 99.92 | 0.939 | 90.00 | 99.60 | 0.932 |
Tab. 5 Comparison results of ablation experiments
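As a reading aid for Tab. 5, the sketch below computes the per-group Hits@1 gap between the two settings and its mean. The input values are copied from the table. The variable names and the choice of a simple arithmetic mean are assumptions made for illustration, not a summary statistic reported by the authors.

```python
from statistics import mean

# Hits@1/% for groups 1-5, copied from Tab. 5.
hits1_with_init_weight = [89.80, 91.00, 90.55, 91.20, 91.50]
hits1_without_init_weight = [87.80, 88.99, 89.52, 89.80, 90.00]

# Per-group gap in percentage points (with init_weight minus without).
gaps = [
    with_w - without_w
    for with_w, without_w in zip(hits1_with_init_weight, hits1_without_init_weight)
]
mean_gap = mean(gaps)

for group, gap in enumerate(gaps, start=1):
    print(f"group {group}: {gap:+.2f} points")
print(f"mean gap: {mean_gap:+.3f} points")
```

In every group the setting with init_weight scores higher on Hits@1, so the gap is positive throughout.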