Bimodal fusion for constructing spatiotemporal knowledge graph in smart home space

doi:10.11772/j.issn.1001-9081.2025010114

Abstract

Progress in the smart home field relies on rich spatiotemporal knowledge graphs to support the design and execution of downstream tasks. However, constructing a spatiotemporal knowledge graph of the smart home space faces challenges such as diverse data sources, low data quality, and limited data scale. To address these problems, a bimodal knowledge extraction framework is proposed that fuses the relative position information of device description documents with user behavior logs, fully mining the multimodal information in both sources to achieve efficient and accurate knowledge extraction and graph construction. The framework consists of two parts. First, a Relative Position Layout Matching (RPLM) method is proposed, which uses the relative position characteristics of device description documents to associate and match the images and text within them. An ontology model of description documents is also designed and combined with a Large Language Model (LLM) to extract structured information and construct the description document knowledge graph. Second, a Functional Correlation Analysis (FCA) algorithm and a Device Usage Behavior Processing (DUBP) algorithm are designed to extract functionally associated device information from user behavior logs and construct the spatiotemporal knowledge graph of the home space. Finally, LayoutLMv3, ERNIE-Layout, GeoLayoutLM, and other models are selected as baselines, and the framework is validated on a self-built Chinese Manual Document Layout Analysis (CMDLA) dataset, a synthesized user behavior log dataset, and three public document analysis datasets.
Experimental results show that the framework outperforms the baseline methods in both the accuracy and the efficiency of knowledge extraction in the home domain, and shows significant advantages in heterogeneous data fusion and spatiotemporal modeling tasks. On the smart home domain dataset, the proposed method achieves an accuracy of 96.39%, which is 0.97 percentage points higher than that of the second-best model, GeoLayoutLM.
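
The abstract does not specify how RPLM matches images and text, so the following is only a minimal illustrative sketch of the general idea of pairing image and text blocks by relative position. It is not the authors' algorithm: the `Block` type, the `match_images_to_text` function, the nearest-center rule, and the sample coordinates are all assumptions made for illustration.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Block:
    """Hypothetical layout block with an axis-aligned box (x0, y0, x1, y1)."""
    name: str
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def center(self):
        # Geometric center of the bounding box.
        return ((self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2)


def match_images_to_text(images, texts):
    """Pair each image with the text block whose center is nearest.

    Assumption: proximity of box centers stands in for the
    'relative position' cue; the paper's RPLM may differ.
    """
    pairs = {}
    for img in images:
        ix, iy = img.center
        nearest = min(
            texts,
            key=lambda t: hypot(t.center[0] - ix, t.center[1] - iy),
        )
        pairs[img.name] = nearest.name
    return pairs


# Made-up page layout: two images, each with a caption to its right.
images = [
    Block("img_panel", 0, 0, 100, 100),
    Block("img_remote", 0, 200, 100, 300),
]
texts = [
    Block("txt_panel", 110, 20, 300, 80),
    Block("txt_remote", 110, 220, 300, 280),
]
pairs = match_images_to_text(images, texts)
print(pairs)
```

A nearest-center rule is the simplest possible choice here; a real document layout would likely also need reading order, column structure, and overlap between regions.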

Key words: smart home, device description document, behavior log, knowledge graph, multimodal fusion, knowledge extraction

摘要： 智慧家庭领域发展依赖于构建丰富的时空知识图谱支撑下游任务设计与执行。然而，构建智慧家庭空间的时空知识图谱面临数据源多样、数据质量低以及规模有限等挑战。为了解决以上问题，文中提出了一种融合说明文档相对位置信息与用户行为日志的双模态知识提取框架，充分挖掘设备说明文档和用户行为日志中的多模态信息，实现高效的知识提取与图谱构建。框架包括两部分：首先，提出一个基于相对位置布局匹配的方法(RPLM)，利用说明文档的相对位置特性，对设备说明文档中的图像和文本进行关联匹配，同时设计说明文档的本体模型，与大语言模型(LLM)融合，提取结构化信息构建说明文档知识图谱；其次，设计了功能关联分析算法(FCA)和设备使用行为处理算法(DUBP)，从用户行为日志中提取功能关联的设备信息并构建家庭空间的时空知识图谱；最后，选取LayoutLMv3、ERNIE-Layout和GeoLayoutLM等作为基准模型，基于一个自建中文说明文档数据集(CMDLA)和合成的用户行为日志数据集，以及三个公开文档分析数据集进行验证。实验结果表明，所提框架在家庭领域知识提取的准确性和效率上优于基线方法，在异构数据融合与时空建模任务中表现出显著优势，在智慧家庭领域数据集上所提方法的准确率达到96.39%，较次优模型GeoLayoutLM提高0.97个百分点。

关键词: 智能家庭, 设备说明文档, 行为日志, 知识图谱, 多模态融合, 知识抽取

CLC Number:

51-1307/TP

王菲, 陶冶, 刘家旺, 李伟, 秦修功, 张宁. 面向智慧家庭空间时空知识图谱的双模态融合构建方法[J]. 计算机应用, DOI: 10.11772/j.issn.1001-9081.2025010114.

