Journal of Computer Applications ›› 2026, Vol. 46 ›› Issue (1): 52-59. DOI: 10.11772/j.issn.1001-9081.2025010114
Fei WANG1, Ye TAO1, Jiawang LIU1, Wei LI2, Xiugong QIN3, Ning ZHANG4
Received: 2025-02-07
Revised: 2025-04-06
Accepted: 2025-04-08
Online: 2026-01-10
Published: 2026-01-10
Corresponding author: Ye TAO
About the author: WANG Fei, born in 1999 in Weifang, Shandong, M.S. candidate. Her research interests include natural language processing and knowledge graphs.
Abstract:
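Progress in the smart home field depends on rich spatio-temporal knowledge graphs that support the design and execution of downstream tasks. However, constructing a spatio-temporal knowledge graph for smart home spaces faces several challenges: diverse data sources, low data quality, and limited data scale. To address these, a bimodal knowledge extraction framework is proposed that integrates the relative position information of description documents with user behavior logs, fully mining the multimodal information in device description documents and user behavior logs to achieve efficient knowledge extraction and graph construction. The framework has two parts. First, a Relative Position Layout Matching (RPLM) method is proposed that uses the relative position characteristics of description documents to associate and match the images and text in device description documents; an ontology model of the description documents is also designed and combined with a Large Language Model (LLM) to extract structured information and construct a description-document knowledge graph. Second, a Function Correlation Analysis (FCA) algorithm and a Device Usage Behavior Processing (DUBP) algorithm are designed to extract functionally associated device information from user behavior logs and construct the spatio-temporal knowledge graph of the home space. LayoutLMv3, ERNIE-Layout, GeoLayoutLM, and other models were selected as baselines, and the framework was validated on a self-built Chinese Manual Document Layout Analysis (CMDLA) dataset, a synthetic user behavior log dataset, and three public document analysis datasets. The results show that the proposed framework outperforms the baseline methods in knowledge extraction accuracy and efficiency on the home-domain dataset, reaching an accuracy of 96.39%, which is 0.97 percentage points higher than the second-best method, GeoLayoutLM, and shows clear advantages in heterogeneous data fusion and spatio-temporal modeling tasks.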
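The abstract does not spell out how RPLM pairs images with text, so the following is only a minimal sketch of the general idea of matching by relative position. It assumes that each image and text block comes with a bounding box and that the nearest text block (by center distance) is the right partner. The `Box` schema, the function name `match_images_to_text`, and the sample coordinates are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box on a page (hypothetical schema, pixel units)."""
    label: str
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def center(self) -> tuple[float, float]:
        return ((self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2)


def match_images_to_text(images: list[Box], texts: list[Box]) -> dict[str, str]:
    """Pair each image with the text block whose center is closest.

    Illustrative stand-in for layout matching; raises ValueError if
    `texts` is empty because there is nothing to match against.
    """
    pairs = {}
    for img in images:
        ix, iy = img.center
        nearest = min(
            texts,
            key=lambda t: hypot(t.center[0] - ix, t.center[1] - iy),
        )
        pairs[img.label] = nearest.label
    return pairs


# Made-up layout: two figures in a left column, captions to their right.
images = [
    Box("fig_power_button", 50, 100, 150, 200),
    Box("fig_filter", 50, 400, 150, 500),
]
texts = [
    Box("Press the power button to start.", 170, 110, 500, 190),
    Box("Clean the filter every month.", 170, 410, 500, 490),
]
matches = match_images_to_text(images, texts)
print(matches)
```

A real system would likely add direction-aware rules (for example, preferring text directly beside or below an image) and learned layout features; plain center distance is used here only to keep the sketch short.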
Fei WANG, Ye TAO, Jiawang LIU, Wei LI, Xiugong QIN, Ning ZHANG. Bimodal fusion method for constructing spatio-temporal knowledge graph in smart home space[J]. Journal of Computer Applications, 2026, 46(1): 52-59.
Fig. 1 Bimodal knowledge extraction framework integrating relative location information of description documents and user behavior logs
| Method | FUNSD (F1) | SROIE (F1) | RVL-CDIP (accuracy) | CMDLA (accuracy) |
|---|---|---|---|---|
| BERT [32] | 65.63 | 90.25 | - | 67.32 |
| LayoutLM [23] | 78.95 | 94.93 | - | 79.28 |
| LayoutLMv2 [24] | 82.76 | 94.95 | 95.25 | 89.95 |
| LayoutLMv3 [25] | 90.29 | 96.56 | 95.44 | 92.34 |
| ERNIE-Layout [26] | 93.12 | 95.64 | 93.14 | |
| GeoLayoutLM [27] | 91.58 | 97.97 | 96.15 | |
| Proposed method | 97.18 | 96.39 | | |
Tab. 1 Comparison of information extraction performance on different datasets (%)
| Method | CMDLA accuracy/% |
|---|---|
| ERNIE-Layout [26] | 89.26 |
| GeoLayoutLM [27] | 95.92 |
| Proposed method | 96.39 |
Tab. 2 Comparison of accuracy of different methods on CMDLA dataset
| Method | CMDLA accuracy/% |
|---|---|
| w/o relative position fusion module | 90.67 |
| w/o ontology model fusion module | 95.24 |
| Proposed method | 96.39 |
Tab. 3 Ablation experimental results on CMDLA dataset
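To make the ablation result easier to read, the short calculation below derives how many percentage points of CMDLA accuracy are lost when each module is removed. The accuracy values are copied from Table 3; the variable names are illustrative only.

```python
# CMDLA accuracy (%) from Table 3.
full_model = 96.39
ablations = {
    "w/o relative position fusion module": 90.67,
    "w/o ontology model fusion module": 95.24,
}

# Accuracy lost, in percentage points, when each module is removed.
drops = {name: round(full_model - acc, 2) for name, acc in ablations.items()}

for name, drop in drops.items():
    print(f"{name}: -{drop:.2f} percentage points")
```

The relative position fusion module accounts for the larger share of the gain (5.72 points versus 1.15 points for the ontology model fusion module).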
| α | Precision/% | Recall/% | F1/% |
|---|---|---|---|
| 0.7 | 88.3 | 95.2 | 91.6 |
| 0.8 | 94.1 | 90.5 | 92.3 |
| 0.9 | 96.8 | 82.1 | 88.8 |
Tab. 4 Impact of different α values on functional association analysis
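As a quick consistency check on Table 4, the F1 column can be recomputed as the harmonic mean of the other two columns. This assumes that the first metric column is precision, which the reported F1 values support; the helper name `f1_score` is illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %) for each threshold α, copied from Table 4.
rows = {
    0.7: (88.3, 95.2),
    0.8: (94.1, 90.5),
    0.9: (96.8, 82.1),
}

f1_by_alpha = {alpha: round(f1_score(p, r), 1) for alpha, (p, r) in rows.items()}
print(f1_by_alpha)
```

The recomputed values (91.6, 92.3, and 88.8) match the table: raising α trades recall for precision, and α = 0.8 gives the best balance of the three settings.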
| b | Average processing time/s | False deletion rate of low-probability events/% |
|---|---|---|
| 50 | 12.3±0.5 | 18.7 |
| 60 | 13.1±0.6 | 8.2 |
| 70 | 14.5±0.7 | 5.4 |
Tab. 5 Comparison of time consumption and false deletion rate under different b values
| [1] | STOLOJESCU-CRISAN C, CRISAN C, BUTUNOI B P. An IoT-based smart home automation system [J]. Sensors, 2021, 21(11): No.3784. |
| [2] | ORFANOS V A, KAMINARIS S D, PAPAGEORGAS P, et al. A comprehensive review of IoT networking technologies for smart home automation applications [J]. Journal of Sensor and Actuator Networks, 2023, 12(2): No.30. |
| [3] | DU Y J, WANG J, CHEN T L, et al. Self-generation of smart home scenarios based on knowledge tower clusters [J]. Journal of Appliance Science and Technology, 2024(S1): 120-124. (in Chinese) |
| [4] | NIRANJANA R, ARVIND S, VIGNESH M, et al. Effectual home automation using ESP32 NodeMCU [C]// Proceedings of the 2022 International Conference on Automation, Computing and Renewable Systems. Piscataway: IEEE, 2022: 1-5. |
| [5] | CHAVIS J S, BUCZAK A, RUBIN A, et al. Connected Home Automated Security Monitor (CHASM): protecting IoT through application of machine learning [C]// Proceedings of the 10th Annual Computing and Communication Workshop and Conference. Piscataway: IEEE, 2020: 684-690. |
| [6] | AMRU M, KANNAN R J, GANESH E N, et al. Network intrusion detection system by applying ensemble model for smart home [J]. International Journal of Electrical and Computer Engineering, 2024, 14(3): 3485-3494. |
| [7] | VARDAKIS G, HATZIVASILIS G, KOUTSAKI E, et al. Review of smart-home security using the Internet of Things [J]. Electronics, 2024, 13(16): No.3343. |
| [8] | RAZA A, LI J, GHADI Y, et al. Smart home energy management systems: research challenges and survey [J]. Alexandria Engineering Journal, 2024, 92: 117-170. |
| [9] | JRHILIFA I, OUADI H, JILBAB A, et al. Forecasting smart home electricity consumption using VMD-Bi-GRU [J]. Energy Efficiency, 2024, 17(4): No.35. |
| [10] | BRUSH A J B, LEE B, MAHAJAN R, et al. Home automation in the wild: challenges and opportunities [C]// Proceedings of the 2011 SIGCHI Conference on Human Factors in Computing Systems. New York: ACM, 2011: 2115-2124. |
| [11] | SOVACOOL B K, FURSZYFER DEL RIO D D. Smart home technologies in Europe: a critical review of concepts, benefits, risks and policies [J]. Renewable and Sustainable Energy Reviews, 2020, 120: No.109663. |
| [12] | HUDA N U, AHMED I, ADNAN M, et al. Experts and intelligent systems for smart homes’ transformation to sustainable smart cities: a comprehensive review [J]. Expert Systems with Applications, 2024, 238(Pt F): No.122380. |
| [13] | TAO Y, LIU J, LI H, et al. KFEX-N: a table-text data question-answering model based on knowledge-fusion encoder and EX-N tree decoder [J]. Neurocomputing, 2024, 593: No.127795. |
| [14] | MOHAMED S K, NOUNU A, NOVÁČEK V. Biological applications of knowledge graph embedding models [J]. Briefings in Bioinformatics, 2021, 22(2): 1679-1693. |
| [15] | WANG J, WANG X, MA C, et al. A survey on the development status and application prospects of knowledge graph in smart grids [J]. IET Generation, Transmission and Distribution, 2021, 15(3): 383-407. |
| [16] | SON J Y, PARK J H, MOON K D, et al. Resource-aware smart home management system by constructing resource relation graph [J]. IEEE Transactions on Consumer Electronics, 2011, 57(3): 1112-1119. |
| [17] | WANG T, CHEN W, LIU L, et al. Detecting smart home automation application interferences with domain knowledge [C]// Proceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering. Piscataway: IEEE, 2023: 1086-1097. |
| [18] | LI W, WANG J, JIAO S, et al. Augmented assembly work instruction knowledge graph for adaptive presentation [C]// Proceedings of the 2021 International Conference on Intelligent Robotics and Applications, LNCS 13013. Cham: Springer, 2021: 793-803. |
| [19] | ZHANG Z, AI Q, YAN J, et al. Knowledge graph construction method of bridge design codes based on ontology and specification parsing [C]// Proceedings of the 2024 Asia Simulation Conference, CCIS 2170. Singapore: Springer, 2024: 58-69. |
| [20] | LI J, LIU S, LIU A, et al. Knowledge graph construction for SOFL formal specifications [J]. International Journal of Software Engineering and Knowledge Engineering, 2022, 32(4): 605-644. |
| [21] | FENG X, ZHANG Y, MENG M H, et al. Detecting contradictions from IoT protocol specification documents based on neural generated knowledge graph [J]. ISA Transactions, 2023, 141: 10-19. |
| [22] | WANG Z, PAN J S, CHEN Q, et al. BiLSTM-CRF-KG: a construction method of software requirements specification graph [J]. Applied Sciences, 2022, 12(12): No.6016. |
| [23] | XU Y, LI M, CUI L, et al. LayoutLM: pre-training of text and layout for document image understanding [C]// Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2020: 1192-1200. |
| [24] | XU Y, XU Y, LV T, et al. LayoutLMv2: multi-modal pre-training for visually-rich document understanding [C]// Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Stroudsburg: ACL, 2021: 2579-2591. |
| [25] | HUANG Y, LV T, CUI L, et al. LayoutLMv3: pre-training for document AI with unified text and image masking [C]// Proceedings of the 30th ACM International Conference on Multimedia. New York: ACM, 2022: 4083-4091. |
| [26] | PENG Q, PAN Y, WANG W, et al. ERNIE-Layout: layout knowledge enhanced pre-training for visually-rich document understanding [C]// Findings of the Association for Computational Linguistics: EMNLP 2022. Stroudsburg: ACL, 2022: 3744-3756. |
| [27] | LUO C, CHENG C, ZHENG Q, et al. GeoLayoutLM: geometric pre-training for visual information extraction [C]// Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2023: 7092-7101. |
| [28] | LIU Y D, ZHANG X, HE S L, et al. UniParser: a unified log parser for heterogeneous log data [C]// Proceedings of the ACM Web Conference 2022. New York: ACM, 2022: 1893-1901. |
| [29] | WANG X H, ZHANG X, LI L Q, et al. SPINE: a scalable log parser with feedback guidance [C]// Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering. New York: ACM, 2022: 1198-1208. |
| [30] | HE P, ZHU J, HE S, et al. Towards automated log parsing for large-scale log data analysis [J]. IEEE Transactions on Dependable and Secure Computing, 2018, 15(6): 931-944. |
| [31] | LEE Y, KIM J, KANG P. LAnoBERT: system log anomaly detection based on BERT masked language model [J]. Applied Soft Computing, 2023, 146: No.110689. |
| [32] | DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional Transformers for language understanding [C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long and Short Papers). Stroudsburg: ACL, 2019: 4171-4186. |
| [33] | JAUME G, EKENEL H K, THIRAN J P. FUNSD: a dataset for form understanding in noisy scanned documents [C]// Proceedings of the 2019 International Conference on Document Analysis and Recognition Workshops — Volume 2. Piscataway: IEEE, 2019: 1-6. |
| [34] | HUANG Z, CHEN K, HE J, et al. ICDAR2019 competition on scanned receipt OCR and information extraction [C]// Proceedings of the 2019 International Conference on Document Analysis and Recognition. Piscataway: IEEE, 2019: 1516-1520. |
| [35] | HARLEY A W, UFKES A, DERPANIS K G. Evaluation of deep convolutional nets for document image classification and retrieval [C]// Proceedings of the 13th International Conference on Document Analysis and Recognition. Piscataway: IEEE, 2015: 991-995. |
| [36] | PAN Z, WU F, ZHANG B. Fine-grained image-text matching by cross-modal hard aligning network [C]// Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2023: 19275-19284. |
| [37] | TRAJANOSKA M, STOJANOV R, TRAJANOV D. Enhancing knowledge graph construction using large language models [EB/OL]. [2024-05-08]. . |
| [38] | CHEN H, CAO G, CHEN J, et al. A practical framework for evaluating the quality of knowledge graph [C]// Proceedings of the 2019 China Conference on Knowledge Graph and Semantic Computing, CCIS 1134. Singapore: Springer, 2019: 111-122. |