# How do you use a probability mass function to calculate the mean and variance of a discrete distribution?

Mar 4, 2017

PMF for a discrete random variable $X$: ${p}_{X} \left(x\right)$ or $p \left(x\right)$.
Mean: $\mu = E \left[X\right] = {\sum}_{x} x \cdot p \left(x\right)$.
Variance: ${\sigma}^{2} = \text{Var} \left[X\right] = {\sum}_{x} \left[{x}^{2} \cdot p \left(x\right)\right] - {\left[{\sum}_{x} x \cdot p \left(x\right)\right]}^{2}$.

#### Explanation:

The probability mass function (or pmf, for short) is a mapping that takes each possible value of a discrete random variable and maps it to its probability. Quick example: if $X$ is the result of a single fair die roll, then $X$ could take on the values $\left\{1, 2, 3, 4, 5, 6\right\}$, each with equal probability $\frac{1}{6}$. The pmf for $X$ would be:

${p}_{X} \left(x\right) = \begin{cases} \frac{1}{6}, & x \in \left\{1, 2, 3, 4, 5, 6\right\} \\ 0, & \text{otherwise} \end{cases}$

If we're only working with one random variable, the subscript $X$ is often left out, so we write the pmf as $p \left(x\right)$.

In short: $p \left(x\right)$ is equal to $P \left(X = x\right)$.
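Because a pmf is just a mapping from values to probabilities, it can be sketched as a plain dictionary. The names below (`pmf`, `p`) are illustrative, not from any particular library:

```python
# A pmf is a mapping from each possible value to its probability.
# Illustrative sketch: the pmf of a single fair die roll.
pmf = {x: 1 / 6 for x in range(1, 7)}

def p(x):
    """p(x) = P(X = x); values outside the support have probability 0."""
    return pmf.get(x, 0)

# Sanity check: the probabilities over the whole support must sum to 1.
assert abs(sum(pmf.values()) - 1) < 1e-12
```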

The mean $\mu$ (or expected value $E \left[X\right]$) of a random variable $X$ is the sum of the weighted possible values for $X$; weighted, that is, by their respective probabilities. If $S$ is the set of all possible values for $X$, then the formula for the mean is:

$\mu = {\sum}_{x \in S} x \cdot p \left(x\right)$.

In our example from above, this works out to be

$\mu = {\sum}_{x = 1}^{6} x \cdot p \left(x\right)$
$\phantom{\mu} = 1 \left(\frac{1}{6}\right) + 2 \left(\frac{1}{6}\right) + 3 \left(\frac{1}{6}\right) + \cdots + 6 \left(\frac{1}{6}\right)$
$\phantom{\mu} = \frac{1}{6} \left(1 + 2 + 3 + 4 + 5 + 6\right)$
$\phantom{\mu} = \frac{21}{6}$

$\phantom{\mu} = 3.5$
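The weighted sum above is a one-liner in code. This sketch reuses the fair-die pmf from the example (an assumption, not a library API) and uses `fractions.Fraction` so the arithmetic stays exact:

```python
from fractions import Fraction

# Mean of a discrete distribution: mu = sum over x of x * p(x).
# Fair-die pmf, with exact rational probabilities.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

mu = sum(x * p for x, p in pmf.items())
print(mu)         # 7/2
print(float(mu))  # 3.5
```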

The variance ${\sigma}^{2}$ (or $\text{Var} \left[X\right]$) of a random variable $X$ is a measure of the spread of the possible values. By definition, it is the expected value of the squared distance between $X$ and $\mu$:

${\sigma}^{2} = E \left[{\left(X - \mu\right)}^{2}\right]$

With some simple algebra and probability theory, this becomes

${\sigma}^{2} = E \left[{X}^{2}\right] - {\mu}^{2}$
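The algebra is worth seeing once: expand the square, apply linearity of expectation, and use the fact that $E \left[X\right] = \mu$ is a constant:

${\sigma}^{2} = E \left[{X}^{2} - 2 \mu X + {\mu}^{2}\right] = E \left[{X}^{2}\right] - 2 \mu E \left[X\right] + {\mu}^{2} = E \left[{X}^{2}\right] - 2 {\mu}^{2} + {\mu}^{2} = E \left[{X}^{2}\right] - {\mu}^{2}$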

We already have a formula for $\mu \text{ } \left(E \left[X\right]\right) ,$ so now we just need a formula for $E \left[{X}^{2}\right] .$ This is the expected value of the squared random variable, so our formula for this is the sum of the squared possible values for $X$, again, weighted by the probabilities of the $x$-values:

$E \left[{X}^{2}\right] = {\sum}_{x \in S} {x}^{2} \cdot p \left(x\right)$

Using this, our formula for the variance of $X$ becomes

${\sigma}^{2} = {\sum}_{x \in S} \left[{x}^{2} \cdot p \left(x\right)\right] - {\mu}^{2}$
$\phantom{{\sigma}^{2}} = {\sum}_{x \in S} \left[{x}^{2} \cdot p \left(x\right)\right] - {\left[{\sum}_{x \in S} x \cdot p \left(x\right)\right]}^{2}$

For our example, $\mu$ was calculated to be $3.5 ,$ so we use that for our last term to get

${\sigma}^{2} = {\sum}_{x = 1}^{6} \left[{x}^{2} \cdot p \left(x\right)\right] - {\mu}^{2}$
$\phantom{{\sigma}^{2}} = \left[{1}^{2} \left(\frac{1}{6}\right) + {2}^{2} \left(\frac{1}{6}\right) + \cdots + {6}^{2} \left(\frac{1}{6}\right)\right] - {\left(3.5\right)}^{2}$
$\phantom{{\sigma}^{2}} = \frac{1}{6} \left(1 + 4 + 9 + 16 + 25 + 36\right) - {\left(3.5\right)}^{2}$
$\phantom{{\sigma}^{2}} = \frac{91}{6} - 12.25$
$\phantom{{\sigma}^{2}} \approx 15.167 - 12.25$
$\phantom{{\sigma}^{2}} \approx 2.917$
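The whole calculation can be checked in a few lines. As before, this is an illustrative sketch using the fair-die pmf and exact `Fraction` arithmetic, which shows the exact answer is $\frac{35}{12}$:

```python
from fractions import Fraction

# Variance via the shortcut formula: Var[X] = E[X^2] - (E[X])^2.
# Fair-die pmf, with exact rational probabilities.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

mu = sum(x * p for x, p in pmf.items())        # E[X]   = 7/2
ex2 = sum(x**2 * p for x, p in pmf.items())    # E[X^2] = 91/6
var = ex2 - mu**2

print(var)                   # 35/12
print(round(float(var), 3))  # 2.917
```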