Question #dec54

1 Answer
Jun 26, 2016

The arithmetic mean is approximately #27.6#.

Explanation:

First, the theory behind the method.

If a random variable #xi# takes values #x_1#, #x_2#, ...#x_n# with corresponding probabilities #p_1#, #p_2#, ...#p_n#, its mathematical expectation, or mean, is by definition
#E(xi) = x_1*p_1+x_2*p_2+...+x_n*p_n = Sigma_(i in [1;n])(x_i*p_i)#

Assume the numbers #x_i# are large enough to make this calculation inconvenient. We can always transform this formula using two freely chosen constants #a# and #h# (with #h != 0#), where the summation #Sigma# runs over all #i# from #1# to #n#:
#E(xi) = Sigma (x_i*p_i) = h*Sigma ((x_i-a)/h*p_i) + Sigma (a*p_i) = #
#= h*Sigma ((x_i-a)/h*p_i) + a*Sigma (p_i) =#
#= h*Sigma ((x_i-a)/h*p_i) + a#
(since #Sigma (p_i) = 1#)

Now it's up to us to choose the constants #a# and #h# so that the calculations are as simple as possible.

If the values #x_i# that our random variable #xi# takes are equally spaced (with a constant step), the best results are achieved if #a# is chosen approximately in the middle of these numbers and #h# is the step.

For example, if the values #x_i# are #10000, 20000, 30000, 40000, 50000#, choosing #a=30000# and #h=10000# makes #(x_i-a)/h# equal to #-2,-1,0,1,2#, which are much easier to deal with than the original large numbers.
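As a quick check of the identity above, here is a minimal Python sketch. The function names `mean_direct` and `mean_shifted` and the probabilities in `probs` are made up for illustration; only the values #10000, ..., 50000# come from the example.

```python
from fractions import Fraction

def mean_direct(values, probs):
    # E(xi) = sum of x_i * p_i
    return sum(x * p for x, p in zip(values, probs))

def mean_shifted(values, probs, a, h):
    # E(xi) = h * sum((x_i - a)/h * p_i) + a, valid for any a and any h != 0
    return h * sum((Fraction(x - a) / h) * p for x, p in zip(values, probs)) + a

values = [10000, 20000, 30000, 40000, 50000]
# Hypothetical probabilities (they must sum to 1)
probs = [Fraction(1, 10), Fraction(2, 10), Fraction(4, 10),
         Fraction(2, 10), Fraction(1, 10)]

print(mean_direct(values, probs))                  # 30000
print(mean_shifted(values, probs, 30000, 10000))   # 30000
print(mean_shifted(values, probs, 7, 3))           # 30000 (any a, h give the same mean)
```

Exact fractions are used so that the two formulas can be compared with `==` without rounding noise.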

Turning to our problem, we take the values #x_i# to be the midpoints of the intervals:
interval #[12.5-17.5]# has midpoint at #15#
interval #[17.5-22.5]# has midpoint at #20#
interval #[22.5-27.5]# has midpoint at #25#
interval #[27.5-32.5]# has midpoint at #30#
interval #[32.5-37.5]# has midpoint at #35#
interval #[37.5-42.5]# has midpoint at #40#
interval #[42.5-47.5]# has midpoint at #45#
interval #[47.5-52.5]# has midpoint at #50#
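The midpoints listed above can be reproduced with a short sketch, assuming the class boundaries start at #12.5# and advance in steps of #5# (the variable names are illustrative):

```python
# Class boundaries 12.5, 17.5, ..., 52.5 (nine boundaries, eight intervals)
boundaries = [12.5 + 5 * k for k in range(9)]

# Midpoint of each interval is the average of its two boundaries
midpoints = [(lo + hi) / 2 for lo, hi in zip(boundaries, boundaries[1:])]

print(midpoints)  # [15.0, 20.0, 25.0, 30.0, 35.0, 40.0, 45.0, 50.0]
```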

For #a# we can choose #30#, since it lies near the middle of the interval midpoints; for #h# we can choose #5#, since that is the increment from one midpoint to the next.

The probabilities that our random variable takes the above values are approximated by the observed relative frequencies. Each frequency is the number of times the value occurred (#4# for the first interval, #20# for the second, etc.) divided by the total number of observations #N=4+20+17+15+2+5+5+2=70#.
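A small sketch of this step, using the eight counts that appear in the sum for #N# (the names `counts` and `freqs` are illustrative):

```python
from fractions import Fraction

# Observed counts per interval, in order from [12.5-17.5] to [47.5-52.5]
counts = [4, 20, 17, 15, 2, 5, 5, 2]
N = sum(counts)

# Relative frequencies used as approximate probabilities p_i
freqs = [Fraction(c, N) for c in counts]

print(N)           # 70
print(sum(freqs))  # 1
```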

Now the mean value of our random variable is evaluated as
#E(xi) = 5*((15-30)/5*4/70+(20-30)/5*20/70+#
#+(25-30)/5*17/70+(30-30)/5*15/70+#
#+(35-30)/5*2/70+(40-30)/5*5/70+#
#+(45-30)/5*5/70+(50-30)/5*2/70) + 30 =#
#= 5/70*((-3)*4+(-2)*20+(-1)*17+0*15+#
#+1*2+2*5+3*5+4*2) + 30 = #
#5/70*(-12-40-17+0+2+10+15+8)+30=#
#=1/14*(-34)+30=30-2.428571...=27.571428...#

In this case it seems sufficient to approximate the mean as #27.6#.
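The arithmetic can be double-checked with a short sketch that computes the mean both ways, through the coded values #(x_i-a)/h# and directly from the midpoints. The variable names are illustrative; the data are the midpoints and counts used above.

```python
from fractions import Fraction

midpoints = [15, 20, 25, 30, 35, 40, 45, 50]
counts = [4, 20, 17, 15, 2, 5, 5, 2]
N = sum(counts)          # 70
a, h = 30, 5

# Coded values (x_i - a)/h: -3, -2, -1, 0, 1, 2, 3, 4
coded = [(x - a) // h for x in midpoints]

# Sum of coded value times count: -12 - 40 - 17 + 0 + 2 + 10 + 15 + 8
coded_sum = sum(u * f for u, f in zip(coded, counts))

# Mean via the shifted formula: a + h * coded_sum / N
mean_coded = a + Fraction(h * coded_sum, N)

# Mean computed directly as sum(x_i * f_i) / N
mean_direct = Fraction(sum(x * f for x, f in zip(midpoints, counts)), N)

print(coded_sum)                    # -34
print(mean_coded)                   # 193/7
print(f"{float(mean_coded):.4f}")   # 27.5714
```

Both routes give #193/7#. Since the coded sum is negative, the mean has to fall below #a=30#.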