PaPoo

What is hybrid search (BM25 + vector)?

Hybrid search is a retrieval method that combines keyword search with vector similarity search so you can find documents that match both the exact words a user typed and the broader meaning of the query.

Why it matters

Pure keyword search is good when the exact term matters. Pure vector search is good when wording varies but meaning is similar. In practice, many real queries need both.

You reach for hybrid search when:

A common pattern is search over docs, help centers, knowledge bases, and enterprise content.

How it works

Hybrid search usually runs two retrieval methods in parallel:

  1. BM25 scores documents by matching query words against document terms. It favors documents with the same keywords, adjusted for term frequency and document length.
  2. Vector search embeds the query and documents into vectors and retrieves documents with similar meaning, even when the words differ.
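The two scorers can be sketched in a few lines. This is a minimal illustration, not a production retriever: the whitespace tokenizer, the sample documents, and the three-dimensional "embeddings" are all made up for the example, and the parameters k1=1.2 and b=0.75 are common defaults rather than anything this article prescribes. A real system would get its vectors from an embedding model.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer, an assumption made for brevity.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score every document against the query with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            # Term frequency is dampened and adjusted for document length.
            denom = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


docs = [
    "reset mfa after losing your phone",
    "recover two-factor authentication on a new device",
    "update your billing address",
]
lexical = bm25_scores("reset mfa", docs)

# Hypothetical embeddings; a real system would compute these with a model.
query_vec = [0.9, 0.1, 0.0]
doc_vecs = [[0.8, 0.2, 0.0], [0.7, 0.3, 0.1], [0.0, 0.1, 0.9]]
semantic = [cosine(query_vec, v) for v in doc_vecs]
```

With these toy inputs, only the first document gets a nonzero BM25 score, because it is the only one containing "reset" or "mfa". The second document shares no query words, yet its made-up vector still sits close to the query vector, which is the gap vector search is meant to cover.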

The system then combines the results. The combination can happen in a few ways:

There is no single universal formula for hybrid search. The important idea is that it blends lexical matching and semantic matching, which makes it more robust than either method alone on messy, natural-language queries.
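One widely used way to combine the two result lists is reciprocal rank fusion (RRF), which looks only at each document's position in each list and ignores the raw scores. The sketch below is an illustration of that one option, not the method this article endorses. The document IDs are hypothetical, and k=60 is a conventional default.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking."""
    fused = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for the documents it returns.
            fused[doc_id] += 1.0 / (k + rank)
    return sorted(fused, key=fused.get, reverse=True)


# Hypothetical result lists, best match first.
bm25_ranking = ["doc_a", "doc_c", "doc_b"]
vector_ranking = ["doc_b", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
print(fused)  # ['doc_a', 'doc_b', 'doc_c', 'doc_d']
```

Documents that appear in both lists (doc_a, doc_b) rise above documents that only one retriever found. Because RRF uses ranks, it avoids having to put BM25 scores and cosine similarities on the same scale.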

Tiny concrete example

A user searches for:

“how do I reset MFA after phone loss”

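A hybrid retriever can return both documents that contain the exact terms, such as "MFA", and documents that describe the same problem in different words, and rank the most useful ones near the top.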
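To make the example concrete, here is a rough sketch of another combination approach: normalize each retriever's scores to the range 0 to 1, then take a weighted sum. The document names and all score values are invented for illustration, and the weight alpha=0.5 is an arbitrary starting point that would need tuning on real queries.

```python
def min_max(scores):
    """Rescale a {doc_id: score} mapping to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def weighted_hybrid(lexical, semantic, alpha=0.5):
    """Rank documents by alpha * lexical + (1 - alpha) * semantic."""
    lex, sem = min_max(lexical), min_max(semantic)
    combined = {
        doc_id: alpha * lex.get(doc_id, 0.0) + (1 - alpha) * sem.get(doc_id, 0.0)
        for doc_id in set(lex) | set(sem)
    }
    return sorted(combined, key=combined.get, reverse=True)


# Hypothetical scores for "how do I reset MFA after phone loss".
lexical = {"reset-mfa-guide": 7.1, "lost-device-recovery": 0.0, "billing-faq": 0.0}
semantic = {"reset-mfa-guide": 0.82, "lost-device-recovery": 0.88, "billing-faq": 0.10}

ranking = weighted_hybrid(lexical, semantic)
print(ranking)  # ['reset-mfa-guide', 'lost-device-recovery', 'billing-faq']
```

In this made-up case the guide that mentions "MFA" by name ranks first because both retrievers like it, while the recovery article that never uses the acronym still lands second on semantic similarity alone.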

Common pitfalls / when NOT to use it

Related terms
