Journal of Computer Applications

Retrieval-augmented generation framework integrating knowledge graph evolution and hybrid retrieval

  

  • Received: 2026-01-12  Revised: 2026-04-08  Online: 2026-05-11  Published: 2026-05-11

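Chen Peng1, Lu Menglong2, Tie Junbo2, Wang Yongwen2, Xun Changqing2, Luo Li2, Pan Guoteng3, Zhou Hailiang2

  1. School of Computer and Communication Engineering, Changsha University of Science and Technology
  2. National University of Defense Technology
  3. College of Computer Science and Technology, National University of Defense Technology
  • Corresponding author: Lu Menglong
  • Funding:
    Research on Key Technologies for Robust Real-Time Rumor Detection Based on Large Models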

Abstract: Current Retrieval-Augmented Generation (RAG) methods lack structured knowledge modeling for complex multi-hop reasoning and rely on static knowledge bases. To address these issues, a RAG framework integrating Knowledge Graph evolution and hybrid retrieval (DKG-RAG) was proposed. First, a Knowledge Graph (KG) was constructed automatically by a Large Language Model (LLM), with an overlapping chunking strategy and a dual-channel vectorization mechanism used to achieve structured knowledge extraction and semantic enhancement. Then, a quadruple fusion retrieval architecture was designed that dynamically combines query analysis, hierarchical graph retrieval, and semantic vector retrieval. Finally, an outer-loop evolution mechanism was introduced that uses a dual-layer quality-aware mechanism to detect knowledge gaps in real time and trigger incremental updates to the graph, and a dense retrieval fallback module based on FAISS (Facebook AI Similarity Search) was incorporated to ensure system robustness. Experimental results show that, compared with the best-performing baseline, Hypothetical Document Embeddings (HyDE), the proposed framework improves the F1 score from 20.5% to 30.28% on the MuSiQue dataset (an increase of 9.78 percentage points), from 36.8% to 46.84% on the 2WikiMQA dataset (10.04 percentage points), and from 42.7% to 46.27% on the HotpotQA dataset (3.57 percentage points). Through its dynamic evolution and hybrid retrieval mechanisms, the framework significantly enhances information retrieval and integration in complex multi-hop question answering, and provides an interpretable technical path and a reusable engineering paradigm for building adaptive knowledge-augmented question answering systems.

Key words: Retrieval-Augmented Generation (RAG), Natural Language Processing (NLP), Large Language Model (LLM), Knowledge Graph (KG), cognitive boundary
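
The overlapping chunking strategy named in the abstract can be illustrated with a minimal sketch. The abstract does not give the chunk size, the overlap, or the unit of splitting, so the function name `overlapping_chunks`, the character-level windows, and the default values below are assumptions made purely for illustration, not the authors' implementation.

```python
def overlapping_chunks(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split `text` into character windows of length `size`.

    Consecutive windows share `overlap` characters, so a fact that straddles
    a chunk boundary still appears whole in at least one chunk.
    NOTE: `size` and `overlap` are hypothetical defaults, not values from the paper.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


if __name__ == "__main__":
    # Windows of 4 characters, each sharing 2 characters with its predecessor.
    print(overlapping_chunks("abcdefghij", size=4, overlap=2))
    # prints ['abcd', 'cdef', 'efgh', 'ghij']
```

In a pipeline of the kind the abstract describes, each chunk would then presumably be passed to the LLM for entity and relation extraction and to an embedding model for vectorization; those stages are outside the scope of this sketch.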
