Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (1): 1-15.DOI: 10.11772/j.issn.1001-9081.2023050583
• Cross-media representation learning and cognitive reasoning •
Chunlei WANG1,2, Xiao WANG1(), Kai LIU3
Received: 2023-05-15
Revised: 2023-06-23
Accepted: 2023-06-28
Online: 2023-08-01
Published: 2024-01-10
Contact: Xiao WANG
About author:
WANG Chunlei, born in 1977 in Yancheng, Jiangsu, Ph.D., research fellow, CCF member. His research interests include knowledge graphs and cognitive intelligence, affective computing and emotion recognition.
Chunlei WANG, Xiao WANG, Kai LIU. Multimodal knowledge graph representation learning: a review[J]. Journal of Computer Applications, 2024, 44(1): 1-15.
URL: http://www.joca.cn/EN/10.11772/j.issn.1001-9081.2023050583
Model basis | Model | Applicable tasks | Advantages | Disadvantages |
---|---|---|---|---|
Translation-based | TransE | Link prediction | Simple and intuitive, with low computational complexity | Unsuitable for modeling complex relations |
Translation-based | TransH | Link prediction, triple classification, fact extraction | Overcomes TransE's inability to model complex relations, so entities still obtain suitable representations under multiple relations | Entities and relations live in the same space, which does not suit some scenarios |
Translation-based | TransR | Link prediction, triple classification, relational fact extraction | Improves the representation space by projecting entities and relations into different spaces | Higher complexity with a sharp increase in parameters, and the projection matrix depends only on the relation |
Translation-based | TransD | Triple classification, link prediction | Splits the projection matrix in two, so head and tail entities no longer share one projection matrix under the same relation | Complex training process requiring considerable computational resources |
Factorization | RESCAL | Link prediction, triple classification | Performs collective learning through the model's latent components and provides an efficient algorithm for computing the factorization | Many parameters and high complexity; computation is resource-intensive and poorly parallelizable |
Factorization | TuckER | Link prediction | Strong expressive power, able to fully separate positive from negative examples; the model is linear and relatively simple | Unsuitable for general prediction tasks: the model equations and optimization are derived separately per task, limiting applicability |
CNN | ConvE | Link prediction, entity-relation prediction | Expressive and parameter-efficient | Cannot capture interactions between input entities and relations, modeling them only within the adjacency matrix of the inputs |
CNN | ConvKB | Link prediction | Optimizes ConvE by replacing its reshaping operation with concatenation, preserving the translational property | Considers each triple independently, missing the complex, hidden information inherent in the triple's local neighborhood |
CNN | ReInceptionE | Link prediction | Addresses ConvE's limited interactions with Inception-enhanced interaction; motivated by KBGAT's shortcomings, proposes an embedding model that fully exploits local and global structural information | Too many hyperparameters, requiring re-tuning for each dataset |
RNN | RSN | Entity alignment, knowledge graph completion | Bridges the gap between entities through a skipping mechanism and adds residual learning to effectively capture long-term relational dependencies within and across KGs | Needs large computational resources and high-quality training data, and in practice its drawbacks can outweigh its advantages |
RNN | DRNN | Context-aware recommendation | Proposes a context-aware KG embedding based on first-order and subgraph-aware proximity, improving accuracy and scalability | Reducing parameters costs information, and gradients vanish or explode on longer sequences |
GNN | R-GCN | Link prediction, entity classification | Adds a relation-aggregation dimension, turning node aggregation into a dual aggregation and strengthening the ability to represent knowledge | Weighs all parameters and relations equally, ignoring the differences between relations |
GNN+attention | RAGAT | Knowledge graph completion | Distinguishes between relations and adds an attention mechanism, fully exploiting the heterogeneity of the KG | Many parameters; training ignores higher-order neighbors and is prone to over-smoothing |
Transformer | CoKE | Link prediction, path query answering | Rather than a single static representation per entity or relation, uses a Transformer encoder to obtain contextualized representations, learning KG embeddings that adapt dynamically to each input sequence and capture the contextual meanings of entities and relations | Too large and deep, with many parameters to learn and greater computational demands |
Transformer | HittER | Link prediction | A bottom block extracts features of each entity-relation pair in the source entity's local neighborhood; a top block aggregates relational information from the bottom block's output, better extracting rich semantic knowledge | Poor interpretability; the rich context introduced may contain spurious information that degrades the representation of the original entity and can cause overfitting |
BERT | KG-BERT | Triple classification | Captures rich semantic knowledge by adding embeddings of contextual information | Higher model complexity; performance improves but interpretability drops sharply |
Tab. 1 Summary of knowledge graph representation learning models
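To make the translation-based rows above concrete, here is a minimal numpy sketch of TransE's scoring function and margin ranking loss. The embedding tables, entity/relation indices, and margin are illustrative stand-ins, not trained values from any paper.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 50

# Toy embedding tables: 5 entities, 3 relations (random, untrained).
E = rng.normal(size=(5, dim))
R = rng.normal(size=(3, dim))

def transe_score(h, r, t):
    """TransE plausibility: negative L2 distance of h + r from t.
    Scores closer to 0 mean the triple (h, r, t) is more plausible."""
    return -np.linalg.norm(E[h] + R[r] - E[t])

def margin_loss(pos, neg, margin=1.0):
    """Margin ranking loss over one positive and one corrupted triple."""
    return max(0.0, margin - transe_score(*pos) + transe_score(*neg))

# After training, a true triple should outscore its corrupted counterpart;
# here we only demonstrate the computation itself.
loss = margin_loss(pos=(0, 1, 2), neg=(0, 1, 4))
print(round(loss, 4))
```

Training would minimize this loss over all observed triples against sampled corruptions, which is exactly the low-complexity recipe the table credits TransE with.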
Model | Base dataset(s) | Applicable tasks | Performance notes | Improvement analysis |
---|---|---|---|---|
pTransE | Freebase[, NY Times | Triple classification, improved relation extraction, analogical reasoning | Comparable to or slightly better than TransE and word2vec (Skip-Gram); mainly aimed at inferring new relational facts | Modifies TransE: entities and words are jointly embedded and aligned in the same space for better embeddings |
CONV | FB15K-237[ | Textual inference | Larger improvements for entity pairs with textual mentions, improving link-prediction performance | Builds on the model of Riedel et al.[ |
TEKE | FB13[ | Link prediction, triple classification | In link prediction, Hits@10 clearly and consistently beats the other baselines, with TEKE_H slightly ahead of TEKE_R; in triple classification, TEKE_E and TEKE_H consistently beat the other baselines (the three models are TEKE variants built on TransE, TransR and TransH) | Built on TransE, TransH and TransR but with different optimization objectives; constructs a co-occurrence network from an entity-annotated text corpus to connect knowledge and text |
DKRL | FB15K[ | Knowledge graph completion, entity type classification | Best results among existing translation-based models in the zero-shot setting, with a clear advantage in zero-shot entity classification as well | Modifies TransE: entity embeddings model both the factual triples and the entity descriptions, exploiting the two jointly |
TKRL | FB15K[ | Knowledge graph completion, triple classification | Outperforms TransE and TransR on both tasks; amplifying the differences between same-type entities clearly improves triple classification | Improves on DKRL by adding multi-level entity types; designs two type encoders to model the hierarchical structure |
SSP | FB15K[ | Knowledge graph completion, entity classification | Outperforms the other baselines in KG completion and is best in entity classification; more accurate than TransE and DKRL | Adds two factors during description interaction to balance textual descriptions and triple information in the embedding vectors; runs a topic model and an embedding model jointly to learn semantics and embeddings together |
AATE, ATE | WN11[ | Link prediction, triple classification | In link prediction, AATE and ATE beat all baselines, with AATE ahead of ATE; in triple classification AATE likewise beats all baselines and improves accuracy on all datasets over ATE | Encodes relations and entities with a BiLSTM and proposes a mutual-attention mechanism to learn more accurate textual representations; the model comprises embedding, BiLSTM and mutual-attention layers |
KDCoE | WK3160K (based on DBpedia[ | Cross-lingual entity alignment, cross-lingual knowledge graph completion | The final stage of KDCoE surpasses all baselines in cross-lingual entity alignment; in cross-lingual KG completion, KDCoE-mono performs at least on par with TransE, showing that KDCoE preserves the structural features of monolingual KGs well | Jointly trains a multilingual Knowledge Graph Embedding Model (KGEM) and a multilingual literal Description Embedding Model (DEM); KGEM uses TransE, while DEM encodes multilingual entity descriptions with an Attentive Gated Recurrent Unit (AGRU) encoder |
KG-BERT | WN11[ | Triple classification, link prediction, relation prediction | Essentially beats all baselines on the three tasks, but link prediction is very time-consuming since nearly every entity must be substituted in as head or tail | Initializes from BERT-Base and fine-tunes it to model and represent the triples |
Tab. 2 Summary of knowledge graph learning models with text information
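As a rough illustration of how text-enhanced models such as DKRL combine structure and descriptions, the sketch below pairs a CBOW-style description encoder (the simplest of DKRL's variants) with TransE-style translation energies summed over both views. The vocabulary, word vectors, and entity descriptions are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 50

# Hypothetical vocabulary and word vectors standing in for pretrained ones.
vocab = {"city": 0, "capital": 1, "france": 2, "river": 3}
W = rng.normal(size=(len(vocab), dim))

def describe(text):
    """CBOW-style description encoder: the description-based entity
    embedding is the mean of the description's word vectors."""
    ids = [vocab[w] for w in text.lower().split() if w in vocab]
    return W[ids].mean(axis=0)

# Structure-based (TransE-style) embeddings for head, relation, tail.
h_s, r, t_s = rng.normal(size=(3, dim))
# Description-based embeddings for the same head and tail entities.
h_d = describe("capital city")
t_d = describe("france")

def energy(h, t):
    return np.linalg.norm(h + r - t)

# The joint objective sums the energies of all structure/description
# combinations, so both views must satisfy the translation h + r ≈ t.
total = energy(h_s, t_s) + energy(h_s, t_d) + energy(h_d, t_s) + energy(h_d, t_d)
print(total > 0)
```

Because the description encoder works for unseen entities with text, this combination is what gives such models their zero-shot advantage noted in the table.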
Model | Base dataset(s) | Applicable tasks | Performance notes | Improvement analysis |
---|---|---|---|---|
IKRL | WN9-IMG (based on WN18[ | Knowledge graph completion, triple classification | Significantly outperforms the baselines in overall quality on both tasks; the results show that images provide complementary information and that the attention mechanism can jointly consider multiple instances | A translation-based overall architecture combining structured knowledge and visual information; jointly learns image and structure representations with a neural image encoder and instance-level attention |
MTKGRL | WN9-IMG[ | Link prediction, triple classification | Beats IKRL in link prediction; in triple classification it exploits multimodal information more effectively, improving average precision over TransE by more than 1 percentage point | A translation-based representation learning model that fuses visual and linguistic information, extending the definition of a triple to build new multimodal representations |
Improved NTL models | WN1M (based on WordNet[ | Category/attribute prediction | Clearly outperform the baseline INTL model on all three datasets; in the open-world (OW) case, performance is only slightly worse than on the zero-shot (ZS) dataset, but lower on the 1K dataset | Trains two KG embedding functions, one on the original NTL architecture and one on an improved, smoothed SNTL model; for image embedding, trains two VGG16-based models targeting the NTL and SNTL entity vectors respectively |
TransAE | WN9-IMG-TXT (based on WN9-IMG[ | Link prediction, triple classification | Beats all baselines in link prediction, with a clear edge over IKRL and DKRL; best in triple classification, where accuracy is in most cases high enough to separate the KG's positive triples from negative ones | Combines a multimodal autoencoder with TransE to learn multimodal and structural knowledge simultaneously; extracts visual and textual feature vectors, feeds them into the autoencoder, and uses the joint embedding as the entity representation |
RSME | WN18-IMG (based on WN18[ | Link prediction | Outperforms all other models; the notable gap between RSME(ViT) and RSME(No Img) shows that visual context genuinely helps, and comparing RSME(ViT) with RSME(ViT+Forget) shows the forget gate brings a further gain in most cases | Consists of a base KG embedding model and three gates: a filter gate automatically discards irrelevant images, a forget gate strengthens beneficial visual features, and a fusion gate then merges visual and KG structural information, with entity and relation embeddings obtained by minimizing the loss |
EVA | DBP15K[ | Entity alignment | Semi-supervised EVA sets a new SOTA on two EA benchmarks, far surpassing previous models; unsupervised EVA reaches above 70.0% accuracy | Uses visual similarity to create an initial seed dictionary, yielding a fully unsupervised solution; couples a multimodal embedding learning process with an alignment learning process to solve entity alignment |
HRGAT | FB15K-237[ | Multimodal knowledge graph completion | Highest on most evaluation metrics among the baselines; beats all four base models and all traditional KG embedding models, reaching high quality on multimodal KG completion | Comprises an information-fusion module (fusing multimodal features via pretrained embeddings and low-rank multimodal fusion), an information-aggregation module (capturing structural information in the multimodal KG), and a prediction module (for multimodal KG completion) |
MMKRL | WN9-IMG[ | Link prediction, triple classification | In link prediction, highest on all metrics except Raw among the multimodal KRL models compared; in triple classification it clearly beats all models, performing best on the FB-IMG dataset | Split into two modules: a knowledge-reconstruction module embeds each kind of knowledge with different pretrained encoders to reconstruct the multimodal KG, while an adversarial-training (AT) module learns structured and multimodal representations in a joint framework |
Tab. 3 Summary of knowledge graph learning models with image information
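The instance-level attention that IKRL-style models use to aggregate an entity's multiple images can be sketched as follows. The image features and the structural embedding are random stand-ins for encoder outputs; in the papers they come from a trained image encoder and a jointly learned TransE-style embedding.

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 50

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Suppose an entity has 4 candidate images; an image encoder (e.g. a CNN
# plus a projection) has already mapped each into the entity space.
img_feats = rng.normal(size=(4, dim))
# The same entity's structure-based embedding, learned jointly.
s = rng.normal(size=dim)

# Instance-level attention: images whose projected features align better
# with the structural embedding receive larger weights.
attn = softmax(img_feats @ s)
visual = attn @ img_feats  # aggregated image-based entity representation

print(attn.shape, visual.shape)
```

The aggregated `visual` vector then enters the translation-based energy alongside the structural embedding, which is how image information supplements the triples in the table above.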
Model | Base dataset(s) | Applicable tasks | Performance notes | Improvement analysis |
---|---|---|---|---|
TransFusion | FB15K[ | Video tag inference | All TransFusion variants beat the base model TransE, including TransFusion-0, which uses no pretrained video embeddings; integrating any extra combination (single, double or triple) of modalities clearly outperforms TransFusion-0 and all baselines | A partially trainable model that fuses KG embeddings with pretrained video embeddings of multiple modalities; combined with a predefined scoring function, the fused video embeddings are used to derive semantic-relation embeddings, which in turn infer tags as a standard link-prediction task |
CLIP-based model[ | Self-built datasets such as CN-DBpedia | Video-relation-tag (VRT) and video-relation-video (VRV) tasks | Gains of 303.4% and 30.2% in HITS@10 on the VRV and VRT tasks respectively, beating all two-stage KGE-based models | First trains the video encoder for video understanding, projects the video embeddings into the same tag embedding space via a CLIP-based model, and finally jointly optimizes the KGE, CLIP, and video-understanding objectives in one model |
Tab. 4 Summary of knowledge graph learning models with audio and video information
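A rough sketch of the TransFusion-style idea in the table above: fuse a KG embedding with pretrained per-modality video embeddings, then reuse translation-based scoring for tag inference. All vectors, the modality weights, and the `hasTag` relation are illustrative assumptions, not the paper's actual parameters.

```python
import numpy as np

rng = np.random.default_rng(4)
dim = 50

# Pretrained video embeddings for one video entity, one per modality
# (e.g. frames, audio, ASR text); values are illustrative.
video_modalities = rng.normal(size=(3, dim))
# Learnable modality weights and a KG-side embedding for the same entity.
alpha = np.array([0.5, 0.3, 0.2])
kg_emb = rng.normal(size=dim)

# Fusion: the video entity's final embedding combines its KG embedding
# with a weighted sum of its per-modality video embeddings.
video_emb = kg_emb + alpha @ video_modalities

def score(head, rel, tail):
    """Translation-based scoring reused for tag inference: a tag is
    plausible for the video if head + rel lands near the tag embedding."""
    return -np.linalg.norm(head + rel - tail)

rel = rng.normal(size=dim)   # hypothetical 'hasTag' relation embedding
tag = rng.normal(size=dim)   # embedding of a candidate tag entity
print(score(video_emb, rel, tag) <= 0)
```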
Model | Base dataset(s) | Applicable tasks | Performance notes | Improvement analysis |
---|---|---|---|---|
PoE | FB15K[ | Link prediction | By fusing three multimodal KGs, verifies the hypothesis that different modalities are complementary for sameAs link prediction; the suffix of PoE-lrni denotes embedding l (latent), r (relational), n (numerical) and i (image) features; PoE-lrni beats the other baseline models, PoE-lrni and PoE-rni perform best, and the embedding experts' responses dominate | Extends a PoE model by incorporating visual information; in a KG, the goal is to learn a PoE that assigns high probability to true triples and low probability to triples assumed false, with one expert per relation type |
MKBE | YAGO-10[ | Link prediction, rating prediction | More accurate in link prediction than the link-prediction models DistMult and ConvE | Replaces the initial layer of any embedding-based relational model with neural encoders and decoders, applied to DistMult and ConvE; suits KG modeling over data types such as text, images, numerical and categorical values |
MMEA | FB15K-DB15K[, FB15K-YAG15K[ | Multimodal knowledge graph entity alignment | Against TransE, MTransE, IPTransE, SEA, GCN, IMUSE and others, MMEA is the best-performing model on both datasets and makes fuller use of limited data | Comprises a multimodal knowledge embedding module and a multimodal knowledge fusion module: the first extracts relational, visual and numerical information to supplement entity features; the second fuses the multimodal knowledge and uses interactive training |
Tab. 5 Summary of knowledge graph representation learning models with multimodal information
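The fusion step that models such as TransAE and MKBE perform can be approximated by the minimal sketch below, which concatenates per-modality features and projects them into a joint entity space. The dimensions and the single tanh-projected linear layer are simplifying assumptions, not any paper's exact architecture.

```python
import numpy as np

rng = np.random.default_rng(3)
d_struct, d_text, d_img, d_joint = 50, 30, 40, 50

# Pretrained per-modality features for one entity (values illustrative).
x_struct = rng.normal(size=d_struct)
x_text = rng.normal(size=d_text)
x_img = rng.normal(size=d_img)

# A single learned projection fusing the concatenated modalities into a
# joint entity space -- the simplest stand-in for an encoder such as the
# multimodal autoencoder used by TransAE.
W = rng.normal(size=(d_joint, d_struct + d_text + d_img)) * 0.05

def fuse(*modalities):
    return np.tanh(W @ np.concatenate(modalities))

e = fuse(x_struct, x_text, x_img)
print(e.shape)  # the fused vector then plays the role of h or t in scoring
```

In the full models, this projection is trained jointly with the triple-scoring objective (and, for TransAE, a reconstruction loss), so the joint space stays consistent with the KG structure.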
1 | SINGHAL A. Introducing the knowledge graph: things, not strings [EB/OL]. (2012-05-16) [2023-03-12]. . |
2 | YUHAS B P, GOLDSTEIN M H, SEJNOWSKI T J. Integration of acoustic and visual speech signals using neural networks [J]. IEEE Communications Magazine, 1989, 27(11): 65-71. 10.1109/35.41402 |
3 | BALTRUŠAITIS T, AHUJA C, MORENCY L-P. Multimodal machine learning: a survey and taxonomy [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 41(2): 423-443. 10.1109/tpami.2018.2798607 |
4 | JI S, PAN S, CAMBRIA E, et al. A survey on knowledge graphs: representation, acquisition, and applications [J]. IEEE Transactions on Neural Networks and Learning Systems, 2022, 33(2): 494-514. 10.1109/tnnls.2021.3070843 |
5 | BORDES A, USUNIER N, GARCIA-DURÁN A, et al. Translating embeddings for modeling multi-relational data [C]// Proceedings of the 26th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2013, 2: 2787-2795. |
6 | LIN Y, LIU Z, SUN M, et al. Learning entity and relation embeddings for knowledge graph completion [C]// Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence. Menlo Park: AAAI Press, 2015: 2181-2187. 10.1609/aaai.v29i1.9491 |
7 | WANG Z, ZHANG J, FENG J, et al. Knowledge graph embedding by translating on hyperplanes [C]// Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence. Menlo Park: AAAI Press, 2014: 1112-1119. 10.1609/aaai.v28i1.8870 |
8 | JI G, HE S, XU L, et al. Knowledge graph embedding via dynamic mapping matrix [C]// Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Stroudsburg, PA: Association for Computational Linguistics, 2015: 687-696. 10.3115/v1/p15-1067 |
9 | XIAO H, HUANG M, HAO Y, et al. TransA: an adaptive approach for knowledge graph embedding [C/OL]// Proceedings of the 2015 AAAI Conference on Artificial Intelligence. [2023-01-05]. . |
10 | NICKEL M, TRESP V, KRIEGEL H-P. A three-way model for collective learning on multi-relational data [C]// Proceedings of the 28th International Conference on Machine Learning. Red Hook: Omnipress, 2011: 809-816. 10.1145/2187836.2187874 |
11 | JENATTON R, ROUX N, BORDES A, et al. A latent factor model for highly multi-relational data [C]// Proceedings of the 25th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2012, 2: 3167-3175. |
12 | BALAŽEVIĆ I, ALLEN C, HOSPEDALES T. TuckER: tensor factorization for knowledge graph completion [C]// Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. Stroudsburg, PA: Association for Computational Linguistics, 2019: 5185-5194. 10.18653/v1/d19-1522 |
13 | BORDES A, WESTON J, COLLOBERT R, et al. Learning structured embeddings of knowledge bases [C]// Proceedings of the 25th AAAI Conference on Artificial Intelligence. Menlo Park: AAAI Press, 2011: 301-306. 10.1609/aaai.v25i1.7917 |
14 | YANG B, YIH W-T, HE X, et al. Embedding entities and relations for learning and inference in knowledge bases [EB/OL]. [2023-01-05]. . |
15 | DETTMERS T, MINERVINI P, STENETORP P, et al. Convolutional 2D knowledge graph embeddings [C]// Proceedings of the 32nd AAAI Conference on Artificial Intelligence and 30th Innovative Applications of Artificial Intelligence Conference and 8th AAAI Symposium on Educational Advances in Artificial Intelligence. Menlo Park: AAAI Press, 2018, 32(1): 1811-1818. 10.1609/aaai.v32i1.11573 |
16 | NGUYEN D Q, NGUYEN T D, NGUYEN D Q, et al. A novel embedding model for knowledge base completion based on convolutional neural network [C]// Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers). Stroudsburg, PA: Association for Computational Linguistics, 2018: 327-333. 10.18653/v1/n18-2053 |
17 | XIE Z, ZHOU G, LIU J, et al. ReInceptionE: relation-aware inception network with joint local-global structural information for knowledge graph embedding [C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: Association for Computational Linguistics, 2020: 5929-5939. 10.18653/v1/2020.acl-main.526 |
18 | GUO L, SUN Z, HU W. Learning to exploit long-term relational dependencies in knowledge graphs [J]. Proceedings of Machine Learning Research, 2019, 97: 2505-2514. 10.1162/dint_a_00016 |
19 | MEZNI H, BENSLIMANE D, BELLATRECHE L. Context-aware service recommendation based on knowledge graph embedding [J]. IEEE Transactions on Knowledge and Data Engineering, 2022, 34(11): 5225-5238. 10.1109/tkde.2021.3059506 |
20 | SCHLICHTKRULL M, KIPF T N, BLOEM P, et al. Modeling relational data with graph convolutional networks [C]// Proceedings of the 2018 European Semantic Web Conference. Cham: Springer, 2018: 593-607. 10.1007/978-3-319-93417-4_38 |
21 | LIU X, TAN H, CHEN Q, et al. RAGAT: Relation aware graph attention network for knowledge graph completion [J]. IEEE Access, 2021, 9: 20840-20849. 10.1109/access.2021.3055529 |
22 | LI Z, LIU H, ZHANG Z, et al. Learning knowledge graph embedding with heterogeneous relation attention networks [J]. IEEE Transactions on Neural Networks and Learning Systems, 2022, 33(8): 3961-3973. 10.1109/tnnls.2021.3055147 |
23 | VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2017: 6000-6010. |
24 | DEVLIN J, CHANG M-W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding [C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg, PA: Association for Computational Linguistics, 2019, 1: 4171-4186. |
25 | WANG Q, HUANG P, WANG H, et al. CoKE: contextualized knowledge graph embedding [EB/OL]. (2020-04-04) [2023-03-25]. . |
26 | YAO L, MAO C, LUO Y. KG-BERT: BERT for knowledge graph completion [EB/OL]. (2019-09-11) [2023-04-03]. . |
27 | CHEN S, LIU X, GAO J, et al. HittER: hierarchical transformers for knowledge graph embeddings [C]// Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: Association for Computational Linguistics, 2021: 10395-10407. 10.18653/v1/2021.emnlp-main.812 |
28 | ALAM M M, RONY M R A H, NAYYERI M, et al. Language model guided knowledge graph embeddings [J]. IEEE Access, 2022, 10: 76008-76020. 10.1109/access.2022.3191666 |
29 | CHEN Y, GE X, YANG S, et al. A survey on multimodal knowledge graphs: construction, completion and applications [J]. Mathematics, 2023, 11(8): 1815. 10.3390/math11081815 |
30 | NIU Y, TANG K, ZHANG H, et al. Counterfactual VQA: a cause-effect look at language bias [C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 12695-12705. 10.1109/cvpr46437.2021.01251 |
31 | ZHAO W, HU Y, WANG H, et al. Boosting entity-aware image captioning with multi-modal knowledge graph [EB/OL]. (2021-07-26) [2023-04-12]. . 10.1109/tmm.2023.3301279 |
32 | LIANG K, MENG L, LIU M, et al. Reasoning over different types of knowledge graphs: static, temporal and multi-modal [EB/OL]. (2023-05-27) [2023-06-16]. . |
33 | WANG M, WANG S, YANG H, et al. Is visual context really helpful for knowledge graph? A representation learning perspective [C]// Proceedings of the 29th ACM International Conference on Multimedia. New York: ACM, 2021: 2735-2743. 10.1145/3474085.3475470 |
34 | SUN R, CAO X, ZHAO Y, et al. Multi-modal knowledge graphs for recommender systems [C]// Proceedings of the 29th ACM International Conference on Information & Knowledge Management. New York: ACM, 2020: 1405-1414. 10.1145/3340531.3411947 |
35 | XU G, CHEN H, LI F L, et al. AliMe MKG: a multi-modal knowledge graph for live-streaming e-commerce [C]// Proceedings of the 30th ACM International Conference on Information & Knowledge Management. New York: ACM, 2021: 4808-4812. 10.1145/3459637.3481983 |
36 | LEHMANN J, ISELE R, JAKOB M, et al. DBpedia — a large-scale, multilingual knowledge base extracted from Wikipedia [J]. Semantic Web, 2015, 6(2): 167-195. 10.3233/sw-140134 |
37 | CHEN X, SHRIVASTAVA A, GUPTA A. NEIL: extracting visual knowledge from web data [C]// Proceedings of the 2013 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2013: 1409-1416. 10.1109/iccv.2013.178 |
38 | VRANDEČIĆ D, KRÖTZSCH M. Wikidata: a free collaborative knowledgebase [J]. Communications of the ACM, 2014, 57(10): 78-85. 10.1145/2629489 |
39 | FERRADA S, BUSTOS B, HOGAN A. IMGpedia: a linked dataset with content-based analysis of Wikimedia images [C]// Proceedings of the 2017 International Semantic Web Conference. Cham: Springer, 2017: 84-93. 10.1007/978-3-319-68204-4_8 |
40 | LIU Z, WANG S, ZHENG L, et al. Robust ImageGraph: rank-level feature fusion for image search [J]. IEEE Transactions on Image Processing, 2017, 26(7): 3128-3141. 10.1109/tip.2017.2660244 |
41 | LIU Y, LI H, GARCIA-DURAN A, et al. MMKG: multi-modal knowledge graphs [C]// Proceedings of the 2019 European Semantic Web Conference. Cham: Springer, 2019: 459-474. 10.1007/978-3-030-21348-0_30 |
42 | LI M, ZAREIAN A, LIN Y, et al. GAIA: A fine-grained multimedia knowledge extraction system [C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations. Stroudsburg, PA: Association for Computational Linguistics, 2020: 77-86. 10.18653/v1/2020.acl-demos.11 |
43 | KANNAN A V, FRADKIN D, AKROTIRIANAKIS I, et al. Multimodal knowledge graph for deep learning papers and code [C]// Proceedings of the 29th ACM International Conference on Information & Knowledge Management. New York: ACM, 2020: 3417-3420. 10.1145/3340531.3417439 |
44 | WANG M, WANG H, QI G, et al. Richpedia: a large-scale, comprehensive multi-modal knowledge graph [J]. Big Data Research, 2020, 22: 100159. 10.1016/j.bdr.2020.100159 |
45 | ALBERTS H, HUANG N, DESHPANDE Y, et al. VisualSem: a high-quality knowledge graph for vision and language [C]// Proceedings of the 1st Workshop on Multilingual Representation Learning. Stroudsburg, PA: Association for Computational Linguistics, 2021: 138-152. 10.18653/v1/2021.mrl-1.13 |
46 | BLOEM P, WILCKE X, VAN BERKEL L, et al. kgbench: A collection of knowledge graph datasets for evaluating relational and multimodal machine learning [C]// Proceedings of the 2021 European Semantic Web Conference. Cham: Springer, 2021: 614-630. 10.1007/978-3-030-77385-4_37 |
47 | WANG Z, LI L, LI Q, et al. Multimodal data enhanced representation learning for knowledge graphs [C]// Proceedings of the 2019 International Joint Conference on Neural Networks. Piscataway: IEEE, 2019: 1-8. 10.1109/ijcnn.2019.8852079 |
48 | WANG Z, ZHANG J, FENG J, et al. Knowledge graph and text jointly embedding [C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: Association for Computational Linguistics, 2014: 1591-1601. 10.3115/v1/d14-1167 |
49 | TOUTANOVA K, CHEN D, PANTEL P, et al. Representing text for joint embedding of text and knowledge bases [C]// Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: Association for Computational Linguistics, 2015: 1499-1509. 10.18653/v1/d15-1174 |
50 | RIEDEL S, YAO L, McCALLUM A, et al. Relation extraction with matrix factorization and universal schemas [C]// Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg, PA: Association for Computational Linguistics, 2013: 74-84. |
51 | WANG Z, LI J. Text-enhanced representation learning for knowledge graph [C]// Proceedings of the 25th International Joint Conference on Artificial Intelligence. Menlo Park: AAAI Press, 2016: 1293-1299. |
52 | XIE R, LIU Z, JIA J, et al. Representation learning of knowledge graphs with entity descriptions [C]// Proceedings of the 30th AAAI Conference on Artificial Intelligence. Menlo Park: AAAI Press, 2016: 2659-2665. 10.1609/aaai.v30i1.10329 |
53 | XIE R, LIU Z, SUN M. Representation learning of knowledge graphs with hierarchical types [C]// Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence. Menlo Park: AAAI Press, 2016: 2965-2971. 10.24963/ijcai.2017/438 |
54 | XIAO H, HUANG M, MENG L, et al. SSP: semantic space projection for knowledge graph embedding with text descriptions [C]// Proceedings of the 31st AAAI Conference on Artificial Intelligence. Menlo Park: AAAI Press, 2017: 3104-3110. 10.1609/aaai.v31i1.10952 |
55 | AN B, CHEN B, HAN X, et al. Accurate text-enhanced knowledge graph representation learning [C]// Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). Stroudsburg, PA: Association for Computational Linguistics, 2018: 745-755. 10.18653/v1/n18-1068 |
56 | CHEN M, TIAN Y, CHANG K-W, et al. Co-training embeddings of knowledge graphs and entity descriptions for cross-lingual entity alignment [C]// Proceedings of the 27th International Joint Conference on Artificial Intelligence. Menlo Park: AAAI Press, 2018: 3998-4004. 10.24963/ijcai.2018/556 |
57 | BOLLACKER K, EVANS C, PARITOSH P, et al. Freebase: a collaboratively created graph database for structuring human knowledge [C]// Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data. New York: ACM, 2008: 1247-1250. 10.1145/1376616.1376746 |
58 | Wikipedia. Wikipedia [M]. [S.l.]: PediaPress, 2004. |
59 | BORDES A, GLOROT X, WESTON J, et al. A semantic matching energy function for learning with multi-relational data: Application to word-sense disambiguation [J]. Machine Learning, 2014, 94: 233-259. 10.1007/s10994-013-5363-6 |
60 | MILLER D. On Nationality [M]. Oxford: Clarendon Press, 1995. 10.2307/2945854 |
61 | LI Z, FENG S, SHI J, et al. Future event prediction based on temporal knowledge graph embedding [J]. Computer Systems Science and Engineering, 2023, 44(3): 2411-2423. 10.32604/csse.2023.026823 |
62 | SOCHER R, CHEN D, MANNING C D, et al. Reasoning with neural tensor networks for knowledge base completion [C]// Proceedings of the 26th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2013: 926-934. |
63 | XIE R, LIU Z, LUAN H, et al. Image-embodied knowledge representation learning [C]// Proceedings of the 26th International Joint Conference on Artificial Intelligence. Menlo Park: AAAI Press, 2017: 3140-3146. 10.24963/ijcai.2017/438 |
64 | MOUSSELLY-SERGIEH H, BOTSCHEN T, GUREVYCH I, et al. A multimodal translation-based approach for knowledge graph representation learning [C]// Proceedings of the Seventh Joint Conference on Lexical and Computational Semantics. Stroudsburg, PA: Association for Computational Linguistics, 2018: 225-234. 10.18653/v1/s18-2027 |
65 | LONIJ V P A, RAWAT A, NICOLAE M-I. Extending knowledge bases using images [C/OL]// Proceedings of the 31st Conference on Neural Information Processing Systems. [2023-01-05]. . |
66 | LIU F, CHEN M, ROTH D, et al. Visual pivoting for (unsupervised) entity alignment [C]// Proceedings of the 35th AAAI Conference on Artificial Intelligence. Menlo Park: AAAI Press, 2021: 4257-4266. 10.1609/aaai.v35i5.16550 |
67 | LU X, WANG L, JIANG Z, et al. MMKRL: A robust embedding approach for multi-modal knowledge graph representation learning [J]. Applied Intelligence, 2022, 52: 7480-7497. 10.1007/s10489-021-02693-9 |
68 | LIANG S, ZHU A, ZHANG J, et al. Hyper-node relational graph attention network for multi-modal knowledge graph completion [J]. ACM Transactions on Multimedia Computing, Communications and Applications, 2023, 19(2): Article No.62. 10.1145/3545573 |
69 | DENG J, DONG W, SOCHER R, et al. ImageNet: A large-scale hierarchical image database [C]// Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2009: 248-255. 10.1109/cvpr.2009.5206848 |
70 | RUSSAKOVSKY O, DENG J, SU H, et al. ImageNet large scale visual recognition challenge [J]. International Journal of Computer Vision, 2015, 115: 211-252. 10.1007/s11263-015-0816-y |
71 | SUN Z, HU W, LI C. Cross-lingual entity alignment via joint attribute-preserving embedding [C]// Proceedings of the 2017 International Semantic Web Conference. Cham: Springer, 2017: 628-644. 10.1007/978-3-319-68288-4_37 |
72 | TOUTANOVA K, CHEN D. Observed versus latent features for knowledge base and text inference [C]// Proceedings of the 3rd Workshop on Continuous Vector Space Models and their Compositionality. Stroudsburg, PA: Association for Computational Linguistics, 2015: 57-66. 10.18653/v1/w15-4007 |
73 | JIN D, QI Z, LUO Y, et al. TransFusion: multi-modal fusion for video tag inference via translation-based knowledge embedding [C]// Proceedings of the 29th ACM International Conference on Multimedia. New York: ACM, 2021: 1093-1101. 10.1145/3474085.3481535 |
74 | SHAN Y, HOENS T R, JIAO J, et al. Deep crossing: web-scale modeling without manually crafted combinatorial features [C]// Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2016: 255-262. 10.1145/2939672.2939704 |
75 | DENG J, SHEN D, PAN H, et al. A unified model for video understanding and knowledge embedding with heterogeneous knowledge graph dataset [EB/OL]. (2023-04-02) [2023-04-19]. . 10.1145/3591106.3592258 |
76 | LI N, SHEN Q, SONG R, et al. MEduKG: A deep-learning-based approach for multi-modal educational knowledge graph construction [J]. Information, 2022, 13(2): Article No. 91. 10.3390/info13020091 |
77 | PEZESHKPOUR P, CHEN L, SINGH S. Embedding multimodal relational data for knowledge base completion [C]// Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: Association for Computational Linguistics, 2018: 3208-3218. 10.18653/v1/d18-1359 |
78 | CHEN L, LI Z, WANG Y, et al. MMEA: entity alignment for multi-modal knowledge graph [C]// Proceedings of the 2020 International Conference on Knowledge Science, Engineering and Management. Cham: Springer, 2020: 134-147. 10.1007/978-3-030-55130-8_12 |
79 | HARPER F M, KONSTAN J A. The MovieLens datasets: history and context [J]. ACM Transactions on Interactive Intelligent Systems, 2015, 5(4): Article No. 19. 10.1145/2827872 |
80 | PANDIT H J, DEBRUYNE C, O’SULLIVAN D, et al. GConsent — a consent ontology based on the GDPR [C]// Proceedings of the 2019 European Semantic Web Conference. Cham: Springer, 2019: 270-282. 10.1007/978-3-030-21348-0_18 |
81 | WILCKE W X, BLOEM P, DE BOER V, et al. End-to-end entity classification on multimodal knowledge graphs [EB/OL]. (2020-05-25) [2023-05-02]. |
82 | GUO H, TANG J, ZENG W, et al. Multi-modal entity alignment in hyperbolic space [J]. Neurocomputing, 2021, 461: 598-607. 10.1016/j.neucom.2021.03.132 |
83 | CHEN D, LI Z, GU B, et al. Multimodal named entity recognition with image attributes and image knowledge [C]// Proceedings of the 2021 International Conference on Database Systems for Advanced Applications. Cham: Springer, 2021: 186-201. 10.1007/978-3-030-73197-7_12 |
84 | YU J, ZHU Z, WANG Y, et al. Cross-modal knowledge reasoning for knowledge-based visual question answering [J]. Pattern Recognition, 2020, 108: 107563. 10.1016/j.patcog.2020.107563 |
85 | SHI B, JI L, LU P, et al. Knowledge aware semantic concept expansion for image-text matching [C/OL]// Proceedings of the 28th International Joint Conference on Artificial Intelligence. [2023-01-05]. 10.24963/ijcai.2019/720 |
86 | CHAUDHARY C, GOYAL P, PRASAD D N, et al. Enhancing the quality of image tagging using a visio-textual knowledge base [J]. IEEE Transactions on Multimedia, 2020, 22(4): 897-911. 10.1109/tmm.2019.2937181 |
87 | TAO S, QIU R, PING Y, et al. Multi-modal knowledge-aware reinforcement learning network for explainable recommendation [J]. Knowledge-Based Systems, 2021, 227: 107217. 10.1016/j.knosys.2021.107217 |
88 | ZHANG C, ZHANG C, ZHENG S, et al. A complete survey on generative AI (AIGC): is ChatGPT from GPT-4 to GPT-5 all you need? [EB/OL]. (2023-03-21) [2023-05-23]. |