The lost-in-the-middle problem is the tendency of a language model to miss information that appears in the middle of a long context, even when that information is relevant to the answer.
This matters any time you stuff a lot of text into an LLM prompt: long documents, chat histories, codebases, retrieval-augmented generation (RAG), or multi-step agent traces.
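In practice, it means a relevant passage can be overlooked simply because of where it sits in the prompt, not because the model never saw it.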
If you build search, summarization, customer support, or agent systems, this is one of the main reasons long-context quality can disappoint even when the model technically supports huge windows.
The term comes from empirical studies of long-context behavior, especially work showing that accuracy can drop when the key evidence is placed in the middle of the input rather than near the beginning or end.
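A useful mental model is a U-shaped curve: the model uses content near the beginning and end of the context most reliably, and content in the middle least reliably.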
This is not a hard rule for all models in all settings, and the exact shape of the effect varies by architecture, training, and prompt design. But the broad phenomenon is documented well enough that teams should assume it can happen.
Suppose you pass a long policy document to an LLM and ask:
“According to this document, who approves refund exceptions?”
The relevant sentence is in the middle:
“Refund exceptions must be approved by the Finance Director.”
If the rest of the document is long enough, the model may answer with a guess from the introduction or conclusion instead of extracting that middle sentence.
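One way to check whether a given model behaves this way is to place the same sentence at different depths in otherwise identical documents and compare the answers. The sketch below only builds those test documents; the filler text and the names build_probe and evidence_position are hypothetical, and the actual model call is left out because it depends on whichever client you use.

```python
def build_probe(filler: list[str], evidence: str, depth: float) -> str:
    """Insert `evidence` into `filler` at a relative depth (0.0 = start, 1.0 = end)."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    index = round(depth * len(filler))
    return " ".join(filler[:index] + [evidence] + filler[index:])


def evidence_position(document: str, evidence: str) -> float:
    """Return where `evidence` starts, as a fraction of the document length."""
    return document.index(evidence) / len(document)


EVIDENCE = "Refund exceptions must be approved by the Finance Director."
QUESTION = "According to this document, who approves refund exceptions?"

# Hypothetical filler standing in for the rest of a long policy document.
FILLER = [f"Section {i} describes a routine procedure." for i in range(200)]

# One document per depth; everything except the evidence position is identical.
probes = {depth: build_probe(FILLER, EVIDENCE, depth) for depth in (0.0, 0.5, 1.0)}

# The prompts you would send to the model, one per depth.
prompts = {
    depth: f"{document}\n\nQuestion: {QUESTION}"
    for depth, document in probes.items()
}

for depth, document in probes.items():
    print(depth, round(evidence_position(document, EVIDENCE), 2))
```

If the answers are correct at depths 0.0 and 1.0 but degrade at 0.5, that is the pattern described above. Real evaluations use many documents, questions, and depths rather than a single example.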
A simple mitigation is to move the key evidence closer to the question, for example by extracting the relevant passage first and then asking the model to answer from that passage.
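A minimal sketch of that extract-then-answer idea follows. It assumes a deliberately crude relevance score (word overlap with the question, minus a small illustrative stopword list); production systems typically use embeddings or a trained reranker instead. The function names and the sample document are hypothetical.

```python
import re

# Small illustrative stopword list; not exhaustive.
STOPWORDS = {"a", "an", "the", "to", "of", "by", "be", "is", "this", "who", "with", "any"}


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its distinct non-stopword words."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def split_sentences(document: str) -> list[str]:
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]


def extract_passage(document: str, question: str, top_k: int = 1) -> list[str]:
    """Return the top_k sentences sharing the most words with the question."""
    question_words = tokenize(question)
    sentences = split_sentences(document)
    ranked = sorted(
        sentences,
        key=lambda s: len(tokenize(s) & question_words),
        reverse=True,
    )
    return ranked[:top_k]


def build_focused_prompt(document: str, question: str) -> str:
    """Build a short prompt that puts the extracted passage next to the question."""
    passage = " ".join(extract_passage(document, question))
    return f"Passage: {passage}\n\nQuestion: {question}\nAnswer using only the passage."


DOCUMENT = (
    "This policy covers customer billing. Invoices are issued monthly. "
    "Refund exceptions must be approved by the Finance Director. "
    "Contact support with any questions about this document."
)
QUESTION = "According to this document, who approves refund exceptions?"

print(build_focused_prompt(DOCUMENT, QUESTION))
```

The model then sees only the passage and the question, so there is no long middle for the evidence to get lost in. The trade-off is that answer quality now depends on the extraction step finding the right passage.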
In practice, teams usually mitigate this by chunking, reranking, extracting the relevant spans, or structuring prompts so the most important information is not stranded in the middle.
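The last of those ideas, keeping important information out of the middle, can be sketched as a reordering step applied after reranking. The example assumes the chunks are already sorted from most to least relevant; reorder_for_edges is a hypothetical helper name, not part of any particular library.

```python
def reorder_for_edges(ranked_chunks: list[str]) -> list[str]:
    """Place the most relevant chunks at the start and end of the context.

    `ranked_chunks` must be sorted most relevant first. Chunks are dealt
    alternately to the front and the back, so the least relevant ones
    end up in the middle of the returned list.
    """
    front: list[str] = []
    back: list[str] = []
    for rank, chunk in enumerate(ranked_chunks):
        if rank % 2 == 0:
            front.append(chunk)
        else:
            back.append(chunk)
    # Reverse the back half so the second-best chunk lands in last position.
    return front + back[::-1]


# "A" is the most relevant chunk, "E" the least.
print(reorder_for_edges(["A", "B", "C", "D", "E"]))  # ['A', 'C', 'E', 'D', 'B']
```

Here the two strongest chunks end up first and last, and the weakest sits in the middle, where a missed passage costs the least. Whether this helps in a given system is an empirical question, so it is worth measuring before and after.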