# What is the difference between the population standard deviation and the sample standard deviation?

Jul 22, 2015

In the formula for a population standard deviation, you divide by the population size $N$, whereas in the formula for the sample standard deviation, you divide by $n - 1$ (the sample size minus one).

#### Explanation:

If $\mu$ is the mean of the population, the formula for the population standard deviation of the population data ${x}_{1} , {x}_{2} , {x}_{3} , \ldots , {x}_{N}$ is

$\sigma = \sqrt{\frac{{\sum}_{k = 1}^{N} {\left({x}_{k} - \mu\right)}^{2}}{N}}$.

If $\overline{x}$ is the mean of a sample, the formula for the sample standard deviation of the sample data ${x}_{1} , {x}_{2} , {x}_{3} , \ldots , {x}_{n}$ is

$s = \sqrt{\frac{{\sum}_{k = 1}^{n} {\left({x}_{k} - \overline{x}\right)}^{2}}{n - 1}}$.
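
As a quick numerical check of the two formulas, here is a minimal Python sketch. The data list and the function names `population_sd` and `sample_sd` are made up for illustration; the standard library's `statistics.pstdev` and `statistics.stdev` compute the same two quantities and are used only as a cross-check.

```python
import math
import statistics

# Hypothetical data set, chosen so the arithmetic is easy to follow by hand.
data = [2, 4, 4, 4, 5, 5, 7, 9]


def population_sd(values):
    """Treat `values` as the whole population: divide by N."""
    n = len(values)
    mu = sum(values) / n
    return math.sqrt(sum((x - mu) ** 2 for x in values) / n)


def sample_sd(values):
    """Treat `values` as a sample: divide by n - 1."""
    n = len(values)
    xbar = sum(values) / n
    return math.sqrt(sum((x - xbar) ** 2 for x in values) / (n - 1))


print(population_sd(data))        # 2.0
print(round(sample_sd(data), 3))  # 2.138

# Cross-check against the standard library.
print(math.isclose(population_sd(data), statistics.pstdev(data)))  # True
print(math.isclose(sample_sd(data), statistics.stdev(data)))       # True
```

The mean here is 5 and the squared deviations sum to 32, so the only difference between the two results is dividing 32 by 8 versus dividing it by 7 before taking the square root.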

The reason for dividing by $n - 1$ rather than $n$ is somewhat technical. Doing so makes the sample variance ${s}^{2}$ a so-called unbiased estimator of the population variance ${\sigma}^{2}$. In effect, if the population is very large and you draw many random samples of the same size $n$ from it, the average of the resulting values of ${s}^{2}$ will be very close to ${\sigma}^{2}$ (and, from a theoretical perspective, the mean of ${s}^{2}$ as a "random variable" is exactly ${\sigma}^{2}$).
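
The repeated-sampling idea can be tried out with a small simulation. The following is only a sketch under assumed settings: the population (100,000 normally distributed values), the sample size `n = 5`, the number of trials, and all variable names are arbitrary choices for illustration, not part of the original answer.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical large population; its variance plays the role of sigma^2.
population = [random.gauss(50, 10) for _ in range(100_000)]
sigma2 = statistics.pvariance(population)

n = 5            # size of each sample (assumed)
trials = 20_000  # number of repeated samples (assumed)

total_div_n_minus_1 = 0.0
total_div_n = 0.0
for _ in range(trials):
    sample = random.sample(population, n)
    xbar = sum(sample) / n
    squared_devs = sum((x - xbar) ** 2 for x in sample)
    total_div_n_minus_1 += squared_devs / (n - 1)  # this is s^2
    total_div_n += squared_devs / n                # dividing by n instead

mean_unbiased = total_div_n_minus_1 / trials
mean_biased = total_div_n / trials

print("population variance:      ", sigma2)
print("average when dividing n-1:", mean_unbiased)
print("average when dividing n:  ", mean_biased)
```

With these settings, the average of the $n - 1$ version should land close to the population variance, while the version that divides by $n$ should come out noticeably too small, by a factor of roughly $\frac{n - 1}{n}$.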

The technicalities of why this is true involve a lot of algebra with summations and are usually not worth the time for beginning students.