A vector index is a data structure that helps a system quickly find the most similar embeddings or vectors to a query vector; it is also commonly called an ANN index, short for approximate nearest neighbor index.
If you store thousands, millions, or billions of vectors and need to find the ones closest to a query, a brute-force scan becomes too slow as the collection grows. A vector index lets you trade a little exactness for much faster retrieval, which is why it shows up in search, retrieval-augmented generation (RAG), recommendations, deduplication, and semantic lookup.
In practice, most teams reach for a vector index when the collection is too large to scan on every query and results must come back with low latency.
The core idea is to avoid comparing the query vector against every stored vector.
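For contrast, here is a minimal sketch of the exhaustive scan that an index tries to avoid. It assumes squared Euclidean distance as the similarity measure; the names `squared_l2` and `brute_force_top_k` are illustrative and do not come from any particular library.

```python
import heapq

def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def brute_force_top_k(query, vectors, k):
    """Return the ids of the k stored vectors closest to query (exact result)."""
    # One distance computation per stored vector, so cost grows linearly with N.
    scored = ((squared_l2(query, v), i) for i, v in enumerate(vectors))
    return [i for _, i in heapq.nsmallest(k, scored)]

vectors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.9, 1.2]]
print(brute_force_top_k([1.0, 1.0], vectors, k=2))  # [1, 3]
```

The result is always exact, but every query touches every vector, which is the cost an index is meant to cut.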
HNSW (Hierarchical Navigable Small World) builds a graph where nearby vectors are connected. At query time, the search walks the graph from entry points toward promising neighbors instead of checking everything. This is usually very fast and often strong in recall, which is why HNSW is a common default in vector databases.
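The graph walk can be sketched roughly as follows. This is a deliberately simplified, single-layer illustration and not full HNSW: the graph is built by brute force, there is no layer hierarchy, and the function names and the parameters `m` (links per node) and `ef` (size of the working result set) are assumptions chosen for the example.

```python
import heapq
import random

def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_knn_graph(vectors, m):
    """Link each vector to its m nearest neighbors (brute force, illustration only)."""
    graph = {}
    for i, v in enumerate(vectors):
        others = ((squared_l2(v, w), j) for j, w in enumerate(vectors) if j != i)
        graph[i] = [j for _, j in heapq.nsmallest(m, others)]
    return graph

def graph_search(query, vectors, graph, entry, k, ef):
    """Best-first walk: expand the closest unexpanded node, keep the ef best seen."""
    d0 = squared_l2(query, vectors[entry])
    visited = {entry}
    candidates = [(d0, entry)]  # min-heap of nodes still to expand
    best = [(-d0, entry)]       # max-heap (negated distances) of the ef closest so far
    while candidates:
        d, node = heapq.heappop(candidates)
        # Stop once the closest remaining candidate is worse than the worst kept result.
        if len(best) >= ef and d > -best[0][0]:
            break
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            dn = squared_l2(query, vectors[nb])
            if len(best) < ef or dn < -best[0][0]:
                heapq.heappush(candidates, (dn, nb))
                heapq.heappush(best, (-dn, nb))
                if len(best) > ef:
                    heapq.heappop(best)  # drop the current worst
    top = sorted((-nd, i) for nd, i in best)[:k]
    return [i for _, i in top], len(visited)

rng = random.Random(0)
vectors = [[rng.random() for _ in range(8)] for _ in range(500)]
graph = build_knn_graph(vectors, m=8)
query = [rng.random() for _ in range(8)]
ids, visited = graph_search(query, vectors, graph, entry=0, k=5, ef=32)
print(ids, visited)  # visited counts how many vectors were actually compared
```

The search only computes distances for nodes it reaches through graph links, so `visited` is normally well below the collection size. The result is approximate: a true neighbor that the walk never reaches will be missed.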
IVF (Inverted File Index) first groups vectors into coarse clusters, then stores vectors by cluster. At query time, the system finds the nearest cluster centers and only searches inside a small subset of clusters. This reduces work dramatically, especially at larger scales. A common variant combines IVF with a compression method such as product quantization, but IVF by itself already means “search within selected buckets.”
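A minimal sketch of the "search within selected buckets" idea, without any compression. It assumes squared Euclidean distance and a few rounds of plain k-means for the coarse clustering; the function names and the parameters `n_lists` and `n_probe` are illustrative choices for this example, not a specific library's API.

```python
import heapq
import random

def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_centroids(vec, centroids, n):
    """Ids of the n centroids closest to vec."""
    scored = ((squared_l2(vec, c), i) for i, c in enumerate(centroids))
    return [i for _, i in heapq.nsmallest(n, scored)]

def train_centroids(vectors, n_lists, iters=5, seed=0):
    """A few rounds of Lloyd's k-means, starting from randomly sampled vectors."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, n_lists)]
    for _ in range(iters):
        groups = [[] for _ in range(n_lists)]
        for v in vectors:
            groups[nearest_centroids(v, centroids, 1)[0]].append(v)
        for c, members in enumerate(groups):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def build_ivf(vectors, centroids):
    """Inverted lists: centroid id -> ids of the vectors assigned to it."""
    lists = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        lists[nearest_centroids(v, centroids, 1)[0]].append(i)
    return lists

def ivf_search(query, vectors, centroids, lists, k, n_probe):
    """Scan only the n_probe inverted lists whose centroids are closest to query."""
    probed = nearest_centroids(query, centroids, n_probe)
    candidates = [i for c in probed for i in lists[c]]
    scored = ((squared_l2(query, vectors[i]), i) for i in candidates)
    return [i for _, i in heapq.nsmallest(k, scored)], len(candidates)

rng = random.Random(1)
vectors = [[rng.random() for _ in range(8)] for _ in range(1000)]
centroids = train_centroids(vectors, n_lists=16)
lists = build_ivf(vectors, centroids)
query = [rng.random() for _ in range(8)]
ids, scanned = ivf_search(query, vectors, centroids, lists, k=5, n_probe=3)
print(ids, scanned)  # scanned is the number of stored vectors actually compared
```

Raising `n_probe` scans more lists, which costs more time but makes it less likely that a true neighbor sitting in an unprobed cluster is missed. Setting it to `n_lists` degenerates into an exhaustive scan.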
Both approaches are approximate nearest neighbor methods: they are designed to return very close results quickly, not necessarily the exact top-k neighbors every time.
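One common way to quantify "very close, but not necessarily exact" is recall@k: the share of the true top-k neighbors that the approximate search also returned. A minimal sketch, with the function name chosen for illustration:

```python
def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k ids that the approximate search also returned."""
    exact = set(exact_ids)
    return len(exact & set(approx_ids)) / len(exact)

# The approximate result missed id 3 and returned id 9 instead.
print(recall_at_k([7, 2, 9, 4], [7, 2, 3, 4]))  # 0.75
```

The exact ids would typically come from a brute-force scan over a sample of queries, which gives a baseline to tune index parameters against.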
Suppose you have 1 million product embeddings and a user searches for “lightweight running shoes.”
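Without a vector index, the system would compare the query embedding against all 1 million vectors. With one, it compares the query only against the small fraction of vectors the index steers it toward (a few graph neighborhoods in HNSW, a few clusters in IVF) and returns the closest products it finds there.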
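To put rough numbers on this example, here is a back-of-the-envelope count of distance computations per query. It assumes an IVF-style index with evenly sized clusters; the values of `n_lists` and `n_probe` are hypothetical and chosen only to make the arithmetic easy.

```python
n_vectors = 1_000_000
n_lists = 1_000  # hypothetical number of IVF clusters
n_probe = 10     # hypothetical number of clusters scanned per query

brute_force = n_vectors
# Compare against every centroid, then against the vectors in the probed clusters.
ivf = n_lists + n_probe * (n_vectors // n_lists)

print(brute_force, ivf, ivf / brute_force)  # 1000000 11000 0.011
```

Under these assumptions the indexed query does about 1% of the work of the full scan. Real clusters are rarely perfectly balanced, so actual counts vary from query to query.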
In short: use a vector index when similarity search is central and latency matters, but don’t assume it is always the best first tool.