A reranker, often implemented as a cross-encoder, is a model that takes a query and a candidate result together and scores how well they match.
A reranker solves the “good enough first pass, better final answer” problem.
In search and retrieval systems, you usually start with a fast retriever that finds a few dozen to a few hundred candidate documents. That first stage is optimized for speed, not for perfect ranking. A reranker then looks at those candidates more carefully and reorders them so the most relevant items rise to the top.
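To make the two stages concrete, here is a minimal sketch of a retrieve-then-rerank pipeline. Everything in it is an assumption made for illustration: the corpus, the query, and both scoring functions (`cheap_score` and `careful_score`) are toy stand-ins. A real system would put a trained model where `careful_score` is.

```python
# Toy two-stage pipeline: cheap retrieval over the whole corpus,
# then a more careful rerank of the shortlist only.
# Both scoring functions are illustrative stand-ins, not real models.

def cheap_score(query: str, doc: str) -> int:
    """First stage: count distinct words shared by query and document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def careful_score(query: str, doc: str) -> float:
    """Second-stage stand-in: fraction of the document's words found in the query.
    A real reranker would call a trained model here."""
    query_words = set(query.lower().split())
    doc_words = doc.lower().split()
    return sum(word in query_words for word in doc_words) / len(doc_words)

def retrieve(query: str, corpus: list[str], k: int) -> list[str]:
    """Return the k documents with the highest cheap score."""
    return sorted(corpus, key=lambda d: cheap_score(query, d), reverse=True)[:k]

def rerank(query: str, candidates: list[str]) -> list[str]:
    """Reorder the shortlist by the careful score, best first."""
    return sorted(candidates, key=lambda d: careful_score(query, d), reverse=True)

# Hypothetical corpus and query, invented for this sketch.
corpus = [
    "how to boil eggs and also how to paint a fence and fix a bike",
    "boil eggs gently",
    "fence painting tips",
    "bike repair basics",
]
query = "how to boil eggs"

shortlist = retrieve(query, corpus, k=2)   # fast pass over everything
final = rerank(query, shortlist)           # careful pass over the shortlist only
print(final)
```

In this toy run the long, unfocused document wins the first stage on raw word overlap, and the second stage moves the short, on-topic document above it. That is the kind of correction a reranker is there to make.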
You’d reach for a reranker when:
A cross-encoder reads the query and the candidate text together as one input. For example, it may process:
Because the model sees both texts at once, it can model fine-grained interactions between words in the query and words in the candidate. That is the key difference from bi-encoders (also called dual encoders), which encode the query and the candidate separately and then compare the resulting vectors.
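The difference in what each model shape sees can be sketched as plain strings. This is only an illustration: the `[CLS]` and `[SEP]` markers follow the BERT-style convention, other model families use different special tokens, and the function names are made up for this example.

```python
# Illustrative only: how the two model shapes receive their inputs.
# "[CLS]" / "[SEP]" follow the BERT-style convention; other models differ.

def cross_encoder_input(query: str, candidate: str) -> str:
    """One joint sequence, so the model can relate query words to candidate words."""
    return f"[CLS] {query} [SEP] {candidate} [SEP]"

def bi_encoder_inputs(query: str, candidate: str) -> tuple[str, str]:
    """Two separate sequences; each side is encoded without seeing the other."""
    return (f"[CLS] {query} [SEP]", f"[CLS] {candidate} [SEP]")

joint = cross_encoder_input("best way to boil eggs", "a candidate passage")
separate = bi_encoder_inputs("best way to boil eggs", "a candidate passage")
print(joint)
print(separate)
```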
The model outputs a relevance score, usually a single number. You compute that score for each candidate and sort the candidates by score, highest first. The reranker usually does not search the whole corpus; it only reorders the shortlist produced by another system.
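This is why cross-encoders are often more accurate than first-stage retrievers, but also slower: the model must run once for every query-candidate pair, and because each score depends on the query, that work cannot be done before the query arrives.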
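A rough way to see the cost is to count model calls. The sketch below uses a hypothetical `CountingScorer` whose `score` method returns a placeholder value; the three queries and the shortlist size of 100 are made-up numbers chosen only to show how the count grows.

```python
# Counting how often a cross-encoder style scorer has to run.
# CountingScorer is a hypothetical stand-in; its score is a placeholder.

class CountingScorer:
    def __init__(self) -> None:
        self.calls = 0

    def score(self, query: str, candidate: str) -> float:
        """Pretend to run the model once for this query-candidate pair."""
        self.calls += 1
        return 0.0  # placeholder, not a real relevance score

scorer = CountingScorer()
queries = ["q1", "q2", "q3"]   # assumed example queries
shortlist_size = 100           # assumed candidates per query

for q in queries:
    for i in range(shortlist_size):
        scorer.score(q, f"candidate {i}")

model_calls = scorer.calls
print(model_calls)  # 300: one call per query-candidate pair
```

With a bi-encoder, by contrast, candidate vectors can typically be computed once ahead of time, so only the query needs encoding when a search comes in. That gap is why the reranker is normally applied to a shortlist rather than to the whole corpus.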
Suppose your retriever returns these three passages for the query “best way to boil eggs”:
A reranker might score them like this:
So the final order becomes 1, 3, 2.
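The sorting step itself is small. In the sketch below the scores are invented placeholders, keyed by passage number and chosen only so that the result matches the 1, 3, 2 order described above; they are not real model outputs.

```python
# Hypothetical relevance scores keyed by passage number.
# The values are invented placeholders, not outputs of a real reranker.
scores = {1: 0.92, 2: 0.15, 3: 0.61}

# Sort passage numbers by score, highest first.
final_order = sorted(scores, key=scores.get, reverse=True)
print(final_order)  # [1, 3, 2]
```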
If you need one sentence: a reranker is the “careful second pass” in a retrieval pipeline, and a cross-encoder is the most common model shape used to do that second pass.