Existing Retrieval-Augmented Generation (RAG) question-answering systems in domain-specific applications face challenges such as a single retrieval path, insufficient coverage of users' implicit intents, and low-quality retrieved segments, resulting in inaccurate and incomplete answers. To address these problems, a two-stage optimization method, Pre-Answering and Retrieval Filtering (PARF), was proposed. Firstly, by integrating domain knowledge graphs with prompt engineering techniques, Large Language Models (LLMs) were guided to generate preliminary answers, thereby constructing a multi-path retrieval route of "original query → preliminary answer → relevant segments" and expanding the semantic space of the original query. Secondly, the retrieved segments were scored and filtered by relevance using a BERT (Bidirectional Encoder Representations from Transformers) model, enabling collaborative optimization between the retrieval and generation stages and improving the density of effective information. Experimental results show that, compared with the RAG question-answering system built with the baseline method DPR-LLM (Dense Passage Retrieval with LLM), the system built with the PARF method improves the consistency metrics F1 score and ROUGE-L (Recall-Oriented Understudy for Gisting Evaluation-L) score by 19.8 and 41.5 percentage points, respectively, on a rail transportation question-answering dataset, and by 16.1 and 17.6 percentage points, respectively, on a medical question-answering dataset; the correct rate of the effectiveness metric increases by 10.2 and 8.8 percentage points on the two datasets, respectively.
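
The two-stage pipeline summarized above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `generate_preliminary_answer` stands in for the knowledge-graph-guided, prompt-engineered LLM call of the pre-answering stage, the dense retriever and the BERT relevance scorer are both replaced by toy token-overlap heuristics, and every function name, parameter, and threshold here is a hypothetical placeholder.

```python
def generate_preliminary_answer(query: str) -> str:
    # Placeholder for the Stage-1 LLM call guided by a domain knowledge
    # graph and prompt engineering. Echoes the query so the example
    # stays self-contained.
    return f"Preliminary answer about: {query}"

def retrieve(text: str, corpus: list[str], k: int = 3) -> list[str]:
    # Placeholder dense retriever: rank corpus segments by token overlap
    # with the retrieval text (the query expanded by its preliminary
    # answer), keeping the top k.
    def overlap(seg: str) -> int:
        return len(set(text.lower().split()) & set(seg.lower().split()))
    return sorted(corpus, key=overlap, reverse=True)[:k]

def relevance_score(query: str, segment: str) -> float:
    # Stands in for the Stage-2 BERT relevance model: fraction of query
    # tokens that also appear in the segment.
    q = set(query.lower().split())
    return len(q & set(segment.lower().split())) / max(len(q), 1)

def parf_retrieve(query: str, corpus: list[str],
                  k: int = 3, threshold: float = 0.3) -> list[str]:
    # Stage 1 (pre-answering): expand the query with a preliminary
    # answer, building the "original query -> preliminary answer ->
    # relevant segments" retrieval path.
    expanded = query + " " + generate_preliminary_answer(query)
    candidates = retrieve(expanded, corpus, k)
    # Stage 2 (retrieval filtering): keep only segments whose relevance
    # to the original query clears the threshold, raising the density
    # of effective information passed to the generator.
    return [s for s in candidates if relevance_score(query, s) >= threshold]
```

In a real system the filtered segments would then be concatenated into the generation prompt; the threshold trades recall against the information density the abstract describes.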