PaPoo
What is a parameter (model size)?

A parameter is a learned number inside a model, and “model size” usually means how many of those learned numbers the model has.

Why it matters

Parameters are the part of a neural network that gets updated during training, so they determine how much capacity the model has to fit patterns in data.

You’ll care about parameter count when you want to compare models, estimate training cost, understand memory use, or get a rough sense of how much hardware a model needs. In practice, “bigger” often means more capable, but only up to a point: data quality, training method, and architecture matter too.

How it works

A neural network is built from layers of simple mathematical operations. Each layer typically contains weights and often biases; these are the parameters.
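To make the weights-and-biases idea concrete, here is a minimal sketch that counts the parameters in fully connected layers. It assumes each layer has one weight per input-output pair plus one bias per output; the function names and the layer sizes are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def dense_layer_params(n_inputs: int, n_outputs: int, bias: bool = True) -> int:
    """Count the learned numbers in one fully connected layer."""
    weights = n_inputs * n_outputs      # one weight per input-output pair
    biases = n_outputs if bias else 0   # one bias per output, if used
    return weights + biases


def mlp_params(layer_sizes: Sequence[int]) -> int:
    """Sum the parameters of consecutive dense layers, e.g. [784, 128, 10]."""
    return sum(
        dense_layer_params(n_in, n_out)
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


print(dense_layer_params(2, 1))    # 2 weights + 1 bias = 3
print(mlp_params([784, 128, 10]))  # 100480 + 1290 = 101770
```

Real architectures contain other kinds of layers with their own counting rules, so this only covers the simplest case.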

During training, the model makes a prediction, measures how wrong it was, and then adjusts those parameters slightly. Repeating that process across lots of data is how the model “learns.”

When people say a model has “7B parameters” or “70B parameters,” they mean it has roughly 7 billion or 70 billion learned values. That count is a proxy for model capacity, but it is not a complete measure of quality.
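As a rough illustration of how parameter count relates to memory use, the sketch below estimates the space needed just to store the weights. It assumes 4 bytes per parameter for 32-bit floats, 2 bytes for 16-bit floats, and 1 byte for 8-bit integers, and it counts 1 GB as 10^9 bytes. The function name is hypothetical, and the estimate leaves out everything else a running model needs, such as activations and optimizer state.

```python
# Bytes needed to store one parameter at common numeric precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(n_params: float, dtype: str = "fp16") -> float:
    """Estimate memory for the weights alone, in gigabytes (10^9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


for name, n_params in [("7B", 7e9), ("70B", 70e9)]:
    print(f"{name}: about {weight_memory_gb(n_params, 'fp16'):.0f} GB in fp16")
```

Under these assumptions, a 7B model needs about 14 GB for its weights in fp16 and a 70B model about 140 GB, which is why parameter count gives a first hint about hardware requirements.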

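A larger parameter count usually means more capacity to fit patterns in data, but also more memory use, higher training cost, and heavier hardware requirements.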

Tiny concrete example

Suppose a tiny linear model predicts house prices:

price = w1 * size + w2 * bedrooms + b

Here, w1, w2, and b are parameters. Training adjusts those three numbers so the prediction gets closer to real prices.
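The following sketch shows one way that adjustment could happen: plain gradient descent on mean squared error. The training data is made up for illustration, and the learning rate and step count are arbitrary choices, not values from any real system.

```python
# Made-up training data: (size in units of 100 m^2, bedrooms, price in thousands).
data = [
    (0.5, 1, 180.0),
    (0.8, 2, 270.0),
    (1.0, 3, 340.0),
    (1.2, 3, 380.0),
    (1.5, 4, 470.0),
]


def predict(params, size, bedrooms):
    w1, w2, b = params
    return w1 * size + w2 * bedrooms + b


def mean_squared_error(params):
    """Average squared gap between predicted and real prices."""
    return sum((predict(params, s, r) - p) ** 2 for s, r, p in data) / len(data)


def gradient_step(params, lr=0.01):
    """Nudge each parameter slightly in the direction that lowers the error."""
    w1, w2, b = params
    n = len(data)
    g1 = g2 = gb = 0.0
    for s, r, p in data:
        err = predict(params, s, r) - p
        g1 += 2 * err * s / n   # gradient with respect to w1
        g2 += 2 * err * r / n   # gradient with respect to w2
        gb += 2 * err / n       # gradient with respect to b
    return (w1 - lr * g1, w2 - lr * g2, b - lr * gb)


params = (0.0, 0.0, 0.0)  # start with all three parameters at zero
initial_loss = mean_squared_error(params)
for _ in range(1000):
    params = gradient_step(params)
final_loss = mean_squared_error(params)

print("learned parameters:", params)
print("error before:", initial_loss, "after:", final_loss)
```

Each pass makes a prediction, measures the error, and moves w1, w2, and b a small step, so the error after training is lower than at the start.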

A large language model works the same way in spirit, except it has many layers and billions of such learned numbers.

Common pitfalls / when NOT to use it

Related terms
