Measures of Variability

Key Questions

What is the difference between the population standard deviation and the sample standard deviation?

Answer:

In the formula for the population standard deviation, you divide by the population size $N$, whereas in the formula for the sample standard deviation, you divide by $n-1$ (the sample size minus one).

Explanation:

If $\mu$ is the mean of the population, the formula for the population standard deviation of the population data $x_{1},x_{2},x_{3},\ldots,x_{N}$ is

$\sigma=\sqrt{\frac{\sum_{k=1}^{N}(x_{k}-\mu)^{2}}{N}}$.

If $\bar{x}$ is the mean of a sample, the formula for the sample standard deviation of the sample data $x_{1},x_{2},x_{3},\ldots,x_{n}$ is

$s=\sqrt{\frac{\sum_{k=1}^{n}(x_{k}-\bar{x})^{2}}{n-1}}$.
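As a rough illustration of the two formulas, here is a small Python sketch that applies both to the same numbers and compares the results with `pstdev` and `stdev` from the standard `statistics` module. The data set is made up for the example; in practice you would use only one of the two formulas, depending on whether the values are a whole population or a sample.

```python
import math
import statistics

# Made-up values, for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(data) / len(data)
# Sum of squared deviations from the mean (the numerator in both formulas).
ss = sum((x - mean) ** 2 for x in data)

# Treating the values as a whole population: divide by N.
sigma = math.sqrt(ss / len(data))
# Treating the values as a sample: divide by n - 1.
s = math.sqrt(ss / (len(data) - 1))

print("divide by N:    ", sigma, statistics.pstdev(data))
print("divide by n - 1:", s, statistics.stdev(data))
```

For these values the sum of squared deviations is 32, so dividing by $N=8$ gives exactly 2, while dividing by $n-1=7$ gives roughly 2.14. The sample formula always gives the slightly larger number.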

The reason for dividing by $n-1$ is somewhat technical. Doing so makes the sample variance $s^{2}$ a so-called unbiased estimator of the population variance $\sigma^{2}$. In effect, if the population is very large and you draw many random samples of the same size $n$ from it, the average of the resulting values of $s^{2}$ will be very close to $\sigma^{2}$ (and, from a theoretical perspective, the mean of $s^{2}$ as a "random variable" is exactly $\sigma^{2}$).

The technicalities of why this is true involve a lot of algebra with summations and are usually not worth the time for beginning students.
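The repeated-sampling idea can be checked numerically without the algebra. The following is a minimal simulation sketch, not a proof. It assumes a normally distributed population with made-up parameters (mean 1000, standard deviation 10, so $\sigma^{2}=100$), and the sample size and number of trials are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

MU, SIGMA = 1000.0, 10.0  # assumed population mean and SD (sigma^2 = 100)
N_SAMPLE = 5              # sample size n
TRIALS = 20_000           # number of repeated samples

def sum_sq_dev(xs):
    """Sum of squared deviations from the sample mean."""
    xbar = sum(xs) / len(xs)
    return sum((x - xbar) ** 2 for x in xs)

var_div_n_minus_1 = []
var_div_n = []
for _ in range(TRIALS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N_SAMPLE)]
    ss = sum_sq_dev(sample)
    var_div_n_minus_1.append(ss / (N_SAMPLE - 1))  # sample variance s^2
    var_div_n.append(ss / N_SAMPLE)                # same sum divided by n

mean_unbiased = statistics.fmean(var_div_n_minus_1)
mean_biased = statistics.fmean(var_div_n)

print("true variance:           ", SIGMA ** 2)
print("average with n - 1:      ", round(mean_unbiased, 1))
print("average with n (too low):", round(mean_biased, 1))
```

Under these assumptions, the average of the values computed with $n-1$ should land close to 100, while dividing by $n$ should give an average near 80, that is, systematically too small by the factor $(n-1)/n$.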

Bill K. · 3 · Jul 22 2015

Of the range and the standard deviation, which is more widely used in statistical analysis, and why?

The standard deviation is the more widely used of the two.

The range simply gives the difference between the lowest and highest values, so a few extreme values can alter it drastically.

The standard deviation $\sigma$ tells you where most of the values lie: in a normal distribution, about 68% of all values fall within one standard deviation of the mean $\mu$, and about 95% fall within two standard deviations of the mean.

Example:
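You have a filling machine that fills kilogram bags of sugar. It will not fill exactly $1000\,\text{g}$ every time; the standard deviation is $10\,\text{g}$.
Then you know (assuming the fill weights are normally distributed around $1000\,\text{g}$) that about $68\%$ of the bags are between $990$ and $1010\,\text{g}$, and about $95\%$ are between $980$ and $1020\,\text{g}$, a total span of $20\,\text{g}$ or $40\,\text{g}$ respectively.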
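The 68% and 95% figures can be checked with a quick simulation. This is only a sketch: it assumes the fill weights follow a normal distribution with mean 1000 g and standard deviation 10 g, and the number of simulated bags is an arbitrary choice.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable

MEAN_G, SD_G = 1000.0, 10.0  # assumed fill mean and standard deviation
N_BAGS = 100_000             # number of simulated bags

weights = [random.gauss(MEAN_G, SD_G) for _ in range(N_BAGS)]

# Fraction of bags within one and within two standard deviations of the mean.
within_1sd = sum(990 <= w <= 1010 for w in weights) / N_BAGS
within_2sd = sum(980 <= w <= 1020 for w in weights) / N_BAGS

print(f"within 990-1010 g: {within_1sd:.1%}")
print(f"within 980-1020 g: {within_2sd:.1%}")
```

The two fractions should come out close to 68% and 95%. These percentages are a property of the normal distribution, so they need not hold for data with a very different shape.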

Every now and again a bag will be far over-filled (say $1100\,\text{g}$) and sometimes a bag will end up empty ($0\,\text{g}$), so the range will be a total of $1100\,\text{g}$.

You may decide which of the two gives a better idea of the spread in this distribution.
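To make the comparison concrete, here is a small sketch with made-up bag weights: 300 ordinary bags, then the same bags plus one empty bag and one over-filled bag. All values are hypothetical and chosen only to mirror the example.

```python
import statistics

# 300 ordinary bags (hypothetical weights in grams).
typical = [990, 995, 1000, 1000, 1005, 1010] * 50
# The same bags plus one empty and one far over-filled bag.
with_extremes = typical + [0, 1100]

def value_range(xs):
    """Difference between the highest and lowest value."""
    return max(xs) - min(xs)

range_typical = value_range(typical)
range_all = value_range(with_extremes)
sd_typical = statistics.stdev(typical)
sd_all = statistics.stdev(with_extremes)

print("range:", range_typical, "->", range_all)
print("SD:   ", round(sd_typical, 1), "->", round(sd_all, 1))
print("range grew by a factor of", round(range_all / range_typical, 1))
print("SD grew by a factor of   ", round(sd_all / sd_typical, 1))
```

Two extreme bags out of 302 take the range from 20 g to 1100 g. The standard deviation is also pulled upward by them, but by a much smaller factor, because every bag contributes to it and not just the two extremes.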

MeneerNask · 2 · Feb 9 2015

What do the standard deviation and the range tell you about a data set, as contrasted to what the mean tells you?

SD: it gives you a numerical value for the variation of the data around the mean.
Range: it gives you the difference between the maximum and minimum values of the data.

Mean: a single point value that represents the average of the data. It says nothing about spread, it is not representative of a typical value in asymmetrical (skewed) distributions, and it is influenced by outliers.
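A short sketch of the contrast, using two made-up data sets that share the same mean but are spread very differently:

```python
import statistics

# Two hypothetical data sets with the same mean.
tight = [48, 49, 50, 51, 52]
wide = [10, 30, 50, 70, 90]

for name, data in (("tight", tight), ("wide", wide)):
    mean = statistics.mean(data)
    data_range = max(data) - min(data)
    sd = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
    print(f"{name}: mean={mean}, range={data_range}, SD={sd:.2f}")
```

Both sets have a mean of 50, so the mean alone cannot tell them apart. The range (4 versus 80) and the standard deviation (about 1.58 versus about 31.62) show how differently the values are spread.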

henriquepizarro · 3 · Feb 24 2015
