Imagine a large forest in which the heights of the trees are normally distributed, with true mean μ = 20 metres and standard deviation σ = 10 metres.
Please take a sample of n = 10 trees so that we can measure their heights!
The mean height from this sample of trees (let's call it x̄₁) was 19.6 metres.
It is not surprising that the sample mean was not exactly equal to the population mean. A sample of only 10 trees is subject to chance variation; only with a very large sample could we expect the sample mean to fall very close to 20.
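Drawing one such sample can be sketched in Python. This is only an illustration under the assumptions stated in the text (heights normally distributed with mean 20 and standard deviation 10); the names MU, SIGMA, N and draw_sample are made up for this example, and the seed is arbitrary, so the result will not reproduce the 19.6 quoted above.

```python
import random
import statistics

MU = 20.0     # population mean height in metres (from the text)
SIGMA = 10.0  # population standard deviation in metres (from the text)
N = 10        # number of trees in one sample (from the text)

def draw_sample(rng, n=N):
    """Return n simulated tree heights drawn from Normal(MU, SIGMA)."""
    return [rng.gauss(MU, SIGMA) for _ in range(n)]

rng = random.Random(1)            # arbitrary seed, used only for repeatability
heights = draw_sample(rng)        # one sample of 10 simulated heights
xbar_1 = statistics.mean(heights) # the sample mean for this sample

print(f"sample mean height: {xbar_1:.1f} metres")
```

Changing the seed gives a different sample and, almost always, a different sample mean, which is the chance variation described above.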
Please take another sample of n = 10 trees so that we can measure their heights too!
The mean height from this sample of trees (let's call it x̄₂) was 21.2 metres. Clearly we are getting some spread in our sample means (x̄₁, x̄₂, x̄₃, etc.), but will this spread be as great as the spread in the heights themselves? What do you think?
Let's take repeated samples of 10 trees, each time recording the sample mean height.
After 1000 such samples, the distribution of means is as follows:

[Figure: distribution of the 1000 sample means]

Note that the distribution of sample means follows a normal distribution. This is consistent with the central limit theorem. Note also that the spread in the sample mean height is less than the spread in the original variate (height). Had each of the repeated samples been larger (n = 100, say), the variation in the sample means would have been smaller still, since each sample mean would then be an even more reliable estimate of the population mean. Indeed, statisticians have shown that the expected standard deviation of the sample means (called the standard error) is σ/√n, where σ is the population standard deviation and n is the sample size. For samples of 10 trees this is 10/√10 ≈ 3.16 metres, compared with 10/√100 = 1 metre for samples of 100 trees.
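The relationship between sample size and the spread of the sample means can be checked with a small simulation. The sketch below assumes the population described in the text (mean 20, standard deviation 10) and takes 1000 samples at each of two sample sizes; the function name simulate_sample_means and the seed are hypothetical choices for this example, not part of the original exercise.

```python
import math
import random
import statistics

MU, SIGMA = 20.0, 10.0  # population mean and SD in metres (from the text)
N_SAMPLES = 1000        # number of repeated samples (from the text)

def simulate_sample_means(n, n_samples=N_SAMPLES, seed=0):
    """Return the means of n_samples samples, each of n simulated heights."""
    rng = random.Random(seed)  # arbitrary seed, used only for repeatability
    means = []
    for _ in range(n_samples):
        heights = [rng.gauss(MU, SIGMA) for _ in range(n)]
        means.append(statistics.mean(heights))
    return means

for n in (10, 100):
    means = simulate_sample_means(n)
    observed = statistics.stdev(means)  # spread of the simulated sample means
    expected = SIGMA / math.sqrt(n)     # standard error: sigma / sqrt(n)
    print(f"n = {n:3d}: SD of sample means = {observed:.2f}, "
          f"sigma/sqrt(n) = {expected:.2f}")
```

The observed standard deviation of the simulated means should land close to, but not exactly on, the value of σ/√n for each sample size, and should be clearly smaller for n = 100 than for n = 10.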