Journal of Computer Applications ›› 2026, Vol. 46 ›› Issue (1): 60-68.DOI: 10.11772/j.issn.1001-9081.2025010082

• Artificial intelligence •

Zero-shot re-ranking method by large language model with hierarchical filtering and label semantic extension

Xinran XIE1,2, Zhe CUI1,2(), Rui CHEN1,2, Tailai PENG1,2, Dekun LIN1,2   

  1. Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu Sichuan 610213, China
    2. University of Chinese Academy of Sciences, Beijing 100049, China
  • Received:2025-01-21 Revised:2025-03-12 Accepted:2025-03-12 Online:2026-01-10 Published:2026-01-10
  • Contact: Zhe CUI
  • About author: XIE Xinran, born in 1997 in Nanchong, Sichuan, Ph.D. candidate. Her research interests include natural language processing and information retrieval.
    CHEN Rui, born in 1995 in Neijiang, Sichuan, Ph.D. candidate. Her research interests include text mining.
    PENG Tailai, born in 1996 in Chengdu, Sichuan, Ph.D. candidate. His research interests include information retrieval.
    LIN Dekun, born in 1996 in Putian, Fujian, Ph.D. candidate. His research interests include long-tailed learning.
  • Supported by:
    Natural Science Foundation of Sichuan Province (2024NSFSC0004)

Abstract:

To address the challenges of insufficient label semantic understanding, vague relationship modeling, and high computational cost of Large Language Models (LLMs) in zero-shot re-ranking tasks, a hierarchical filtering and label semantic extension method named HFLS (Hierarchical Filtering and Label Semantics) was proposed. In this method, a multi-level label semantic extension path was constructed, and a progressive prompting strategy of "keyword matching → semantic association → domain knowledge integration" was designed to guide LLMs in deep relevance reasoning. At the same time, a hierarchical filtering mechanism was introduced to reduce computational complexity while retaining high-potential candidate documents. Experimental results on seven benchmark datasets, including TREC-DL2019, show that HFLS achieves average gains of 21.92%, 13.43%, and 8.59% in NDCG@10 (Normalized Discounted Cumulative Gain) over the Pointwise methods Pointwise.qg, Pointwise.yes_no, and Pointwise.3Label, respectively. In terms of inference efficiency, HFLS reduces per-query processing latency by 91.06%, 68.87%, and 33.54% compared to the Listwise, Pairwise, and Setwise methods, respectively.
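
The headline metric above, NDCG@10, rewards placing highly relevant documents near the top of the ranking by discounting each relevance grade logarithmically with rank and normalizing against the ideal ordering. A minimal sketch of the widely used linear-gain variant follows; the function name and the choice of linear (rather than exponential) gain are illustrative assumptions, not details taken from the paper:

```python
import math


def ndcg_at_k(relevances, k=10):
    """NDCG@k for a ranked list of graded relevance labels.

    relevances: relevance grades in ranked order (index 0 = top result).
    Uses linear gain: DCG = sum(rel_i / log2(i + 2)).
    """
    def dcg(rels):
        return sum(rel / math.log2(i + 2) for i, rel in enumerate(rels))

    ideal = dcg(sorted(relevances, reverse=True)[:k])
    return dcg(relevances[:k]) / ideal if ideal > 0 else 0.0
```

A perfectly ordered list (grades non-increasing) scores 1.0, and swapping a relevant document down the list lowers the score, which is why re-ranking quality is typically reported with this metric.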

Key words: Large Language Model (LLM), zero-shot learning, re-ranking, information retrieval, prompt engineering
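
The abstract describes hierarchical filtering only at a high level: candidates are scored in stages, with progressively richer prompts applied to progressively fewer surviving documents, so the expensive later prompts never see the full candidate pool. The following is a speculative structural sketch of that idea, not the paper's implementation; `score_fn` stands in for the LLM relevance-scoring call, and the stage sizes and prompt-level labels are illustrative assumptions:

```python
def hierarchical_rerank(query, docs, score_fn, stage_sizes=(20, 10)):
    """Hypothetical multi-stage filter-then-rerank loop.

    Each stage rescores the surviving candidates at a richer prompt
    level and keeps only the top stage_sizes[i] documents, reducing
    the number of costly LLM calls at later, more expensive stages.
    """
    # Prompt levels mirror the abstract's progression:
    # keyword matching -> semantic association -> domain knowledge integration.
    prompt_levels = ("keyword matching",
                     "semantic association",
                     "domain knowledge integration")
    ranked = list(docs)
    for level, keep in zip(prompt_levels, stage_sizes):
        # Stable sort: ties preserve the previous stage's ordering.
        ranked.sort(key=lambda d: score_fn(query, d, level), reverse=True)
        ranked = ranked[:keep]
    return ranked
```

Because each stage discards most candidates before the next, more elaborate prompt is applied, per-query latency grows with the final cutoff rather than with the full candidate set, which is consistent with the latency reductions reported over Listwise, Pairwise, and Setwise baselines.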

CLC Number: