Semi-supervised learning listwise ranking functions for document retrieval

doi:10.3724/SP.J.1087.2011.03108

Journal of Computer Applications, 2011, Vol. 31, Issue (11): 3108-3111. DOI: 10.3724/SP.J.1087.2011.03108

• Artificial intelligence •

HE Hai-jiang, LONG Yue-jin

Department of Computer Science and Technology, Changsha University, Changsha, Hunan 410003, China

Received: 2011-04-25 Revised: 2011-07-13 Online: 2011-11-16 Published: 2011-11-01
Contact: HE Hai-jiang

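Chinese title (translated): A semi-supervised listwise learning-to-rank algorithm for document retrieval

HE Hai-jiang, LONG Yue-jin

Department of Computer Science, Changsha University, Changsha 410003, China

Corresponding author: HE Hai-jiang
About the authors: HE Hai-jiang (born 1970), male, from Wangcheng, Hunan; associate professor and CCF member; research interests: machine learning and Web mining.
LONG Yue-jin (born 1958), male, from Huitong, Hunan; lecturer; research interests: data mining and databases.
Funding:
Scientific Research Project of the Hunan Provincial Department of Education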

Abstract

Abstract: An iterative co-ranking algorithm is proposed that extends learning to rank from the supervised setting to the semi-supervised setting. The approach employs two listwise rankers, each of which produces a document permutation for every unlabeled query. A likelihood listwise loss measures the difference score between the two learners for a given query. Unlabeled queries with a large difference score are selected for the new training dataset at the next iteration, and the ideal document permutation that each listwise ranker trains on for such a query is the one produced by the other learner. Experimental results on the public dataset LETOR show that the proposed method improves the ranking performance of the supervised listwise ranking algorithm. The effect of the labeling ratio is also discussed.
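The abstract does not spell out the likelihood loss or how it is turned into a difference score. As a rough illustration only, the sketch below uses the standard Plackett-Luce negative log-likelihood (the loss used by ListMLE-style rankers) and, as an assumption of this example, sums the loss in both directions to compare two rankers. The names `likelihood_loss`, `ranking_of`, and `difference_score` are hypothetical and do not come from the paper.

```python
import math
from typing import List, Sequence


def likelihood_loss(scores: Sequence[float], permutation: Sequence[int]) -> float:
    """Negative Plackett-Luce log-likelihood of `permutation` under `scores`.

    `scores[d]` is a ranker's score for document d; `permutation` lists
    document indices from the top position to the bottom position.
    """
    loss = 0.0
    for i in range(len(permutation)):
        remaining = [scores[d] for d in permutation[i:]]
        # log-sum-exp over the documents not yet placed, computed stably
        m = max(remaining)
        log_norm = m + math.log(sum(math.exp(s - m) for s in remaining))
        loss += log_norm - scores[permutation[i]]
    return loss


def ranking_of(scores: Sequence[float]) -> List[int]:
    """Document indices sorted by descending score."""
    return sorted(range(len(scores)), key=lambda d: -scores[d])


def difference_score(scores_a: Sequence[float], scores_b: Sequence[float]) -> float:
    """Assumed symmetric disagreement between two rankers on one query:
    each ranker's scores are evaluated against the other's permutation."""
    return (likelihood_loss(scores_a, ranking_of(scores_b))
            + likelihood_loss(scores_b, ranking_of(scores_a)))


# Toy query with three documents: one ranker pair agrees, the other disagrees.
agree = difference_score([3.0, 2.0, 1.0], [2.5, 1.5, 0.5])
disagree = difference_score([3.0, 2.0, 1.0], [0.5, 1.5, 2.5])
```

Under this definition the query on which the two rankers order the documents in opposite ways receives the larger score, which is the kind of query the abstract says is selected.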

Key words: document retrieval, semi-supervised, rank learning, likelihood loss, co-training

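Abstract (translated from the Chinese): To address the shortage of labeled training data, a co-training listwise learning-to-rank algorithm is proposed that mines implicit ranking information from unlabeled data. The algorithm uses two kinds of listwise rankers, each of which builds a different ranking function from the currently available labeled dataset. Every unlabeled query therefore has two different document permutations. The likelihood loss is used to compute the similarity of these two permutations, and queries whose permutations have low similarity are labeled, which gives the two listwise rankers additional training data. Experimental results on LETOR, a public learning-to-rank dataset, confirm that the co-training ranking algorithm is effective. The effect of the labeling ratio on the algorithm is also discussed.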
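One selection-and-labeling round of the co-training scheme described here can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the helper names (`order`, `co_label`), the `top_k` cutoff, and the toy rankers and toy disagreement measure are all hypothetical, and the disagreement measure is passed in as a parameter so that any loss can be substituted.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Doc = Tuple[float, float]   # toy document: two feature values (assumption)
Permutation = List[int]


def order(scores: Sequence[float]) -> Permutation:
    """Document indices sorted by descending score."""
    return sorted(range(len(scores)), key=lambda d: -scores[d])


def co_label(
    unlabeled: Dict[str, List[Doc]],
    rank_a: Callable[[List[Doc]], List[float]],
    rank_b: Callable[[List[Doc]], List[float]],
    difference: Callable[[Sequence[float], Sequence[float]], float],
    top_k: int,
) -> Tuple[Dict[str, Permutation], Dict[str, Permutation]]:
    """Pick the top_k unlabeled queries on which the rankers disagree most
    and give each ranker the OTHER ranker's permutation as its new label."""
    scored = []
    for qid, docs in unlabeled.items():
        sa, sb = rank_a(docs), rank_b(docs)
        scored.append((difference(sa, sb), qid, sa, sb))
    scored.sort(key=lambda t: t[0], reverse=True)

    new_for_a: Dict[str, Permutation] = {}
    new_for_b: Dict[str, Permutation] = {}
    for _, qid, sa, sb in scored[:top_k]:
        new_for_a[qid] = order(sb)   # ranker A trains on B's permutation
        new_for_b[qid] = order(sa)   # ranker B trains on A's permutation
    return new_for_a, new_for_b


# Toy usage: ranker A scores by the first feature, ranker B by the second.
unlabeled = {
    "q1": [(0.9, 0.8), (0.5, 0.4), (0.1, 0.2)],   # features agree
    "q2": [(0.9, 0.1), (0.5, 0.5), (0.1, 0.9)],   # features disagree
}
rank_a = lambda docs: [d[0] for d in docs]
rank_b = lambda docs: [d[1] for d in docs]


def toy_difference(sa: Sequence[float], sb: Sequence[float]) -> int:
    """Stand-in measure: number of positions where the two orderings differ."""
    return sum(x != y for x, y in zip(order(sa), order(sb)))


new_for_a, new_for_b = co_label(unlabeled, rank_a, rank_b, toy_difference, top_k=1)
```

In a full loop, each ranker would be retrained on its labeled set plus the newly labeled queries, and the round would repeat. How many queries to add per round and when to stop are choices the abstract does not specify.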

Key words (translated from the Chinese): document retrieval, semi-supervised, learning to rank, likelihood loss, co-training

HE Hai-jiang, LONG Yue-jin. Semi-supervised learning listwise ranking functions for document retrieval [J]. Journal of Computer Applications, 2011, 31(11): 3108-3111.

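Chinese citation (translated): HE Hai-jiang, LONG Yue-jin. A semi-supervised listwise learning-to-rank algorithm for document retrieval [J]. Journal of Computer Applications (计算机应用), 2011, 31(11): 3108-3111.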

