Intuitive explanation of Learning to Rank (and RankNet, LambdaRank and LambdaMART)

Summary

This article gives an intuitive explanation of Learning to Rank (LTR) algorithms, including RankNet, LambdaRank, and LambdaMART, which are used to optimize the ordering of items in a list, primarily for search engine ranking.

Abstract

Learning to Rank (LTR) is a machine learning approach specifically designed to address ranking problems by ordering a list of items based on their relevance rather than classifying or assigning a numerical score to individual items. This technique is particularly useful in search engine results ranking but can be applied to any scenario requiring a ranked list. The training data for LTR includes a list of items and their corresponding ground truth scores, often determined by human raters. RankNet, LambdaRank, and LambdaMART are LTR algorithms developed by Microsoft Research, with each subsequent algorithm building upon the previous one to improve speed and accuracy. RankNet uses neural networks and minimizes inversions in ranking, LambdaRank focuses on the gradients of the cost function, and LambdaMART combines LambdaRank with Multiple Additive Regression Trees (MART) to enhance performance.

Opinions

The author suggests that traditional machine learning focuses on single-instance prediction, whereas LTR is concerned with the relative ordering of multiple items.
Human rating systems are considered a common method for generating relevance ratings for search engine results.
The author implies that LambdaRank's approach of scaling gradients by the change in NDCG is an improvement over RankNet's cost function.
LambdaMART is presented as the most effective LTR algorithm among the three, based on experimental dataset results.

What is Learning to Rank?

Learning to Rank (LTR) is a class of techniques that apply supervised machine learning (ML) to solve ranking problems. The main difference between LTR and traditional supervised ML is this:

Traditional ML solves a prediction problem (classification or regression) on a single instance at a time. E.g. if you are doing spam detection on email, you will look at all the features associated with that email and classify it as spam or not. The aim of traditional ML is to come up with a class (spam or no-spam) or a single numerical score for that instance.

LTR solves a ranking problem on a list of items. The aim of LTR is to come up with an optimal ordering of those items. As such, LTR doesn’t care much about the exact score that each item gets, but cares more about the relative ordering among all the items.

The most common application of LTR is search engine ranking, but it’s useful anywhere you need to produce a ranked list of items.

The training data for an LTR model consists of a list of items and a “ground truth” score for each of those items. For search engine ranking, this translates to a list of results for a query and a relevance rating for each of those results with respect to the query. The most common way major search engines generate these relevance ratings is to ask human raters to rate results for a set of queries. In case you are interested, I have written in detail on human rating systems here: Nikhil Dandekar’s answer to How does Google measure the quality of their search results?
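To make the shape of this training data concrete, here is a minimal sketch. The queries, document ids, and the 0 to 4 rating scale are all made up for illustration; real datasets would also carry a feature vector for each query-document pair.

```python
# Hypothetical toy data: each query maps to (doc_id, relevance_rating) pairs.
# The 0-4 rating scale is an assumption (0 = irrelevant, 4 = perfect match).
training_data = {
    "learning to rank": [("doc_a", 4), ("doc_b", 1), ("doc_c", 3)],
    "gradient boosting": [("doc_d", 0), ("doc_e", 2)],
}

def ideal_ordering(rated_docs):
    """Return doc ids sorted by rating, highest first (the target ranking)."""
    ranked = sorted(rated_docs, key=lambda pair: pair[1], reverse=True)
    return [doc_id for doc_id, _ in ranked]

for query, rated_docs in training_data.items():
    print(query, "->", ideal_ordering(rated_docs))
```

Note that the ratings only matter within a query: a rating of 2 for one query is never compared against a rating of 4 for another.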

For a more technical explanation of Learning to Rank check this paper by Microsoft Research: A Short Introduction to Learning to Rank

What are RankNet, LambdaRank and LambdaMART?

RankNet, LambdaRank and LambdaMART are all LTR algorithms developed by Chris Burges and his colleagues at Microsoft Research. RankNet was the first one to be developed, followed by LambdaRank and then LambdaMART.

In all three techniques, ranking is transformed into a pairwise classification or regression problem. That means you look at pairs of items at a time, come up with the optimal ordering for that pair of items, and then use it to come up with the final ranking for all the results.
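As a rough illustration of the pairwise view, the sketch below turns the ratings for one query into preference pairs. The function name `preference_pairs` and the example ratings are hypothetical, and skipping ties is an assumption about how equal ratings are handled.

```python
from itertools import combinations

def preference_pairs(ratings):
    """Return (i, j) index pairs where item i is rated higher than item j."""
    pairs = []
    for i, j in combinations(range(len(ratings)), 2):
        if ratings[i] > ratings[j]:
            pairs.append((i, j))
        elif ratings[j] > ratings[i]:
            pairs.append((j, i))
        # Ties carry no ordering preference, so they are skipped.
    return pairs

ratings = [3, 1, 3, 0]  # made-up ratings for four results of one query
print(preference_pairs(ratings))
```

Each pair becomes one training example: the model should score the first item of the pair above the second.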

Here are some high-level details for each of the algorithms:

RankNet was originally developed using neural nets, but the underlying model can be different and is not constrained to just neural nets. The cost function for RankNet aims to minimize the number of inversions in ranking. Here an inversion means an incorrect order among a pair of results, i.e. when we rank a lower rated result above a higher rated result in a ranked list. RankNet optimizes the cost function using Stochastic Gradient Descent.
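A small sketch may help connect inversions to the cost. Counting inversions directly is not differentiable, so RankNet uses a smooth pairwise cross-entropy on the score difference instead. The form below is the one for a pair whose first item is known to be more relevant; `sigma` is a shape parameter, and the helper names and example numbers are made up.

```python
import math

def count_inversions(scores, ratings):
    """Count pairs where a lower-rated item gets a higher model score."""
    n = len(scores)
    return sum(
        1
        for i in range(n)
        for j in range(n)
        if ratings[i] > ratings[j] and scores[i] < scores[j]
    )

def ranknet_pair_cost(s_hi, s_lo, sigma=1.0):
    """Cross-entropy cost for a pair where the first item should rank higher."""
    return math.log(1.0 + math.exp(-sigma * (s_hi - s_lo)))

ratings = [2, 0, 1]
scores = [0.5, 1.5, 1.0]  # hypothetical model scores that reverse the ideal order
print(count_inversions(scores, ratings))
# A correctly ordered pair costs less than an inverted one.
print(ranknet_pair_cost(2.0, 0.0) < ranknet_pair_cost(0.0, 2.0))
```

The cost shrinks toward zero as the higher-rated item's score pulls ahead, and grows roughly linearly when the pair is inverted.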

Burges et al. found that during the RankNet training procedure you don’t need the costs themselves, only the gradients (λ) of the cost with respect to the model scores. You can think of these gradients as little arrows attached to each document in the ranked list, indicating the direction we’d like those documents to move.
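The "little arrows" can be sketched as follows. For each pair where document i is rated above document j, the gradient of the pairwise cost with respect to the score of i is accumulated on i and subtracted from j. This is a minimal sketch for a single query; the function name and the example numbers are hypothetical.

```python
import math

def ranknet_lambdas(scores, ratings, sigma=1.0):
    """Accumulate per-document gradients of the RankNet cost for one query."""
    n = len(scores)
    lambdas = [0.0] * n
    for i in range(n):
        for j in range(n):
            if ratings[i] > ratings[j]:
                # Gradient of the pair cost with respect to scores[i].
                lam = -sigma / (1.0 + math.exp(sigma * (scores[i] - scores[j])))
                lambdas[i] += lam
                lambdas[j] -= lam
    return lambdas

ratings = [2, 0, 1]
scores = [0.0, 0.0, 0.0]  # an untrained model that scores everything the same
for rating, lam in zip(ratings, ranknet_lambdas(scores, ratings)):
    # Gradient descent moves a score against its gradient.
    direction = "up" if lam < 0 else "down" if lam > 0 else "stay"
    print(f"rating={rating} lambda={lam:+.2f} -> move {direction}")
```

Because every pair adds to one document exactly what it takes from the other, the lambdas for a query sum to zero.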

Further, they found that scaling these gradients by the change in NDCG (Normalized Discounted Cumulative Gain, a standard ranking quality metric) obtained by swapping each pair of documents gave good results. The core idea of LambdaRank is to train the RankNet model with these scaled gradients instead of the gradients of the original cost function. On experimental datasets, this shows both speed and accuracy improvements over the original RankNet.
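The scaling step can be sketched like this. The NDCG variant used here (gain 2^rating - 1, discount log2(position + 1)) is a common choice but an assumption on my part, and the function names and example numbers are made up.

```python
import math

def dcg(ranked_ratings):
    """Discounted cumulative gain of ratings listed in ranked order."""
    return sum((2 ** r - 1) / math.log2(pos + 2) for pos, r in enumerate(ranked_ratings))

def ndcg(ranked_ratings):
    """DCG normalized by the DCG of the ideal (sorted) ordering."""
    ideal = dcg(sorted(ranked_ratings, reverse=True))
    return dcg(ranked_ratings) / ideal if ideal > 0 else 0.0

def delta_ndcg(ranked_ratings, a, b):
    """Absolute change in NDCG if the items at positions a and b are swapped."""
    swapped = list(ranked_ratings)
    swapped[a], swapped[b] = swapped[b], swapped[a]
    return abs(ndcg(swapped) - ndcg(ranked_ratings))

def lambdarank_lambdas(scores, ratings, sigma=1.0):
    """RankNet pair gradients, each scaled by the |delta NDCG| of swapping the pair."""
    n = len(scores)
    order = sorted(range(n), key=lambda k: scores[k], reverse=True)
    position = {doc: pos for pos, doc in enumerate(order)}
    ranked_ratings = [ratings[doc] for doc in order]
    lambdas = [0.0] * n
    for i in range(n):
        for j in range(n):
            if ratings[i] > ratings[j]:
                ranknet_grad = -sigma / (1.0 + math.exp(sigma * (scores[i] - scores[j])))
                lam = ranknet_grad * delta_ndcg(ranked_ratings, position[i], position[j])
                lambdas[i] += lam
                lambdas[j] -= lam
    return lambdas

ratings = [2, 0, 1]
scores = [0.5, 1.5, 1.0]  # hypothetical scores that rank the best document last
print(lambdarank_lambdas(scores, ratings))
```

The effect is that pairs whose swap would change NDCG a lot, typically those involving the top of the list, get larger arrows than pairs near the bottom.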

LambdaMART combines LambdaRank and MART (Multiple Additive Regression Trees). While MART uses gradient boosted decision trees for prediction tasks, LambdaMART uses gradient boosted decision trees with a cost function derived from LambdaRank to solve a ranking task. On experimental datasets, LambdaMART has shown better results than LambdaRank and the original RankNet.
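To show how the two pieces fit together, here is a heavily simplified sketch, not the published algorithm: each boosting round computes LambdaRank-style gradients and fits a one-split regression tree (a stump) to their negatives. It assumes a single query, a single made-up feature, sigma fixed to 1, and plain mean leaf values; the actual LambdaMART uses deeper trees and a Newton step for the leaf values. All names and numbers are hypothetical.

```python
import math

def dcg(ranked_ratings):
    return sum((2 ** r - 1) / math.log2(p + 2) for p, r in enumerate(ranked_ratings))

def ndcg_of(scores, ratings):
    """NDCG of the ordering induced by the scores."""
    order = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
    ideal = dcg(sorted(ratings, reverse=True))
    return dcg([ratings[k] for k in order]) / ideal if ideal > 0 else 0.0

def lambdas(scores, ratings):
    """LambdaRank-style gradients for one query (sigma fixed to 1)."""
    n = len(scores)
    order = sorted(range(n), key=lambda k: scores[k], reverse=True)
    pos = {doc: p for p, doc in enumerate(order)}
    ideal = dcg(sorted(ratings, reverse=True)) or 1.0
    out = [0.0] * n
    for i in range(n):
        for j in range(n):
            if ratings[i] > ratings[j]:
                gain = (2 ** ratings[i] - 1) - (2 ** ratings[j] - 1)
                disc = 1 / math.log2(pos[i] + 2) - 1 / math.log2(pos[j] + 2)
                delta = abs(gain * disc) / ideal  # |delta NDCG| of swapping i and j
                lam = -delta / (1.0 + math.exp(scores[i] - scores[j]))
                out[i] += lam
                out[j] -= lam
    return out

def fit_stump(xs, targets):
    """Fit a one-split regression tree on one feature by squared error."""
    overall = sum(targets) / len(targets)
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
        if best is None or err < best[0]:
            best = (err, threshold, lm, rm)
    if best is None:  # all feature values identical: predict the mean
        return lambda x: overall
    _, threshold, lm, rm = best
    return lambda x: lm if x <= threshold else rm

def lambdamart_sketch(features, ratings, rounds=20, learning_rate=0.5):
    """Boost stumps on the negative lambdas; returns a scoring function."""
    trees = []
    scores = [0.0] * len(features)
    for _ in range(rounds):
        grads = lambdas(scores, ratings)
        tree = fit_stump(features, [-g for g in grads])
        trees.append(tree)
        scores = [s + learning_rate * tree(x) for s, x in zip(scores, features)]
    return lambda x: sum(learning_rate * t(x) for t in trees)

features = [0.1, 0.3, 0.5, 0.9]  # one made-up feature per document
ratings = [0, 1, 2, 3]
model = lambdamart_sketch(features, ratings)
ndcg_before = ndcg_of([0.0] * 4, ratings)
ndcg_after = ndcg_of([model(x) for x in features], ratings)
print(ndcg_before, ndcg_after)
```

The only ranking-specific part is the `lambdas` function; the rest is ordinary gradient boosting, which is the sense in which LambdaMART plugs LambdaRank gradients into MART.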

If you are interested, Chris Burges has a single paper that details the evolution from RankNet to LambdaRank to LambdaMART here: From RankNet to LambdaRank to LambdaMART: An Overview