PaPoo

What is temperature in an LLM?

Temperature is a sampling setting that controls how random or deterministic an LLM’s next-token choices are when it generates text.

Why it matters

Temperature is one of the simplest ways to trade off consistency versus variety.

In practice, many teams start with a low temperature for production workflows and raise it only when they explicitly want more variation.

How it works

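An LLM produces a probability distribution over possible next tokens. Temperature rescales that distribution before a token is sampled: values below 1 sharpen it toward the most likely tokens, and values above 1 flatten it so that less likely tokens are chosen more often.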

A useful mental model: temperature does not change what the model “knows”; it changes how adventurous the generator is when choosing among candidate continuations.
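
The rescaling is commonly written as a temperature-adjusted softmax. The notation below is an assumption added for illustration and does not come from this article: z_i is the model's raw score (logit) for candidate token i, T is the temperature, and p_i is the resulting sampling probability.

```latex
p_i = \frac{\exp(z_i / T)}{\sum_{j} \exp(z_j / T)}, \qquad T > 0
```

Under this formulation, T = 1 leaves the model's distribution unchanged, T approaching 0 concentrates nearly all probability on the highest-scoring token, and large T pushes the distribution toward uniform. The scores z_i themselves are untouched, which matches the mental model above: the model's "knowledge" stays the same and only the selection step changes.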

Tiny concrete example

Suppose the model is continuing the text after:

“The best way to explain temperature is…”

At low temperature, it might repeatedly pick the most likely continuation, such as:

“to think of it as a randomness control.”

At higher temperature, it may more often choose alternate phrasings, such as:

“to treat it as a knob for variety.”

“to view it as a sampling dial.”

Same underlying model, different sampling behavior.
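
The example above can be simulated with a minimal sketch. It treats each continuation as a single candidate, which is a simplification, since a real model samples one token at a time. The scores in LOGITS are made up for illustration, and the function names are hypothetical helpers rather than part of any library.

```python
import math
import random
from collections import Counter


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing by temperature."""
    if temperature <= 0:
        # Dividing by zero is undefined; systems that accept 0 typically
        # treat it as "always pick the top candidate" (an assumption here).
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_counts(candidates, logits, temperature, n=1000, seed=0):
    """Draw n samples at the given temperature and count each candidate."""
    rng = random.Random(seed)  # fixed seed so repeated runs match
    probs = softmax_with_temperature(logits, temperature)
    picks = rng.choices(candidates, weights=probs, k=n)
    return Counter(picks)


# Hypothetical candidates and scores, chosen only for illustration.
CANDIDATES = [
    "to think of it as a randomness control.",
    "to treat it as a knob for variety.",
    "to view it as a sampling dial.",
]
LOGITS = [2.0, 1.0, 0.5]

if __name__ == "__main__":
    for t in (0.2, 1.0, 2.0):
        probs = softmax_with_temperature(LOGITS, t)
        counts = sample_counts(CANDIDATES, LOGITS, t)
        print(f"T={t}")
        for text, p in zip(CANDIDATES, probs):
            print(f"  p={p:.3f}  picked {counts[text]:4d}x  {text}")
```

With these assumed scores, the first continuation dominates almost completely at T=0.2, while at T=2.0 the three options are picked at much closer rates. The scores never change between runs; only the temperature does.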

Common pitfalls / when NOT to use it

In short: if you want steadier outputs, lower it; if you want more variety, raise it.

Related terms
