A Self-Training Method for Learning to Rank with Unlabeled Data
Vinh Truong(1), Massih-Reza Amini(2), Patrick Gallinari
(1) Laboratoire d'Informatique Paris 6, 104, avenue du président Kennedy, 75016 Paris
(2) National Research Council Canada, 123, boulevard Alexandre Taché, Gatineau, Canada
This paper presents a new algorithm for learning bipartite ranking functions from partially labeled data. The algorithm is an extension of the self-training paradigm developed under the classification framework. We further propose an efficient and scalable optimization method for training linear models, though the approach is general in the sense that it can be applied to any class of scoring functions. Empirical results on several common image and text corpora, measured by the Area Under the ROC Curve (AUC) and Average Precision, show that using unlabeled data in the training process improves the performance of baseline supervised ranking functions.