Self-consistency is a way to get a language model to answer by generating several reasoning paths and then choosing the answer that appears most often, instead of trusting a single chain of thought.
A single reasoning trace from an LLM can be brittle: one unlucky step can derail the final answer. Self-consistency helps when the task needs multi-step reasoning, such as math word problems, logic puzzles, or multi-hop question answering.
In practice, you reach for it when you want a cheap reliability boost without changing the model itself. It is especially useful when the model can produce different valid reasoning paths that still lead to the same final answer.
The basic idea is simple:
This is usually described in the paper “Self-Consistency Improves Chain of Thought Reasoning in Language Models” (Wang et al., 2022). The intuition is that if many different reasoning paths converge on the same result, that result is more likely to be correct than the answer from just one sampled path.
It is not the same as majority voting over arbitrary text. The voting is typically over the model’s final answer after reasoning, which matters because the reasoning text itself can vary a lot.
Suppose you ask:
If a book costs $12 and you get a 25% discount, what is the final price?
You sample the model 5 times. The reasoning may differ, but the final answers come back as:
Self-consistency returns $9, because that is the most common final answer.
If you want one practical rule: start with a single well-prompted run, then add self-consistency when accuracy matters more than cost.
Self-consistency is a way to get a language model to answer by generating several reasoning paths and then choosing the answer that appears most often, instead of trusting a single chain of thought.
A single reasoning trace from an LLM can be brittle: one unlucky step can derail the final answer. Self-consistency helps when the task needs multi-step reasoning, such as math word problems, logic puzzles, or multi-hop question answering.
In practice, you reach for it when you want a cheap reliability boost without changing the model itself. It is especially useful when the model can produce different valid reasoning paths that still lead to the same final answer.
The basic idea is simple:
This is usually described in the paper “Self-Consistency Improves Chain of Thought Reasoning in Language Models” (Wang et al., 2022). The intuition is that if many different reasoning paths converge on the same result, that result is more likely to be correct than the answer from just one sampled path.
It is not the same as majority voting over arbitrary text. The voting is typically over the model’s final answer after reasoning, which matters because the reasoning text itself can vary a lot.
Suppose you ask:
If a book costs $12 and you get a 25% discount, what is the final price?
You sample the model 5 times. The reasoning may differ, but the final answers come back as:
Self-consistency returns $9, because that is the most common final answer.
If you want one practical rule: start with a single well-prompted run, then add self-consistency when accuracy matters more than cost.