Choosing a discrete or continuous probability distribution to represent a process depends on prior knowledge (however scanty) of the physical system that we assume underlies the process. Many choices are available, and there is not room to discuss them here. The widespread use of the normal distribution in statistical inference, however, rests on the fact that the sampling distribution of the mean tends to be normal. More precisely, the central limit theorem states that as the number of independent, identically distributed random variables with finite variance increases, the distribution of their mean becomes increasingly normal. Furthermore, the variance of the mean is inversely proportional to the sample size: it equals the population variance divided by the sample size. We call the square root of the variance of the mean the standard error of the mean.
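The relationship between the standard error and the sample size can be checked with a short simulation. The sketch below is not the applet's code; it is a minimal illustration that assumes a Uniform(0, 1) population (standard deviation sqrt(1/12)), and the function name standard_error_demo and its parameters are hypothetical. It compares the observed standard deviation of many sample means with sigma / sqrt(n).

```python
import math
import random
import statistics


def standard_error_demo(n, num_samples=5000, seed=0):
    """Return (empirical SD of sample means, sigma / sqrt(n)).

    Assumption: the population is Uniform(0, 1), chosen only for illustration.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(1.0 / 12.0)  # population SD of Uniform(0, 1)
    # Draw num_samples samples of size n and record the mean of each one.
    means = [
        statistics.fmean(rng.random() for _ in range(n))
        for _ in range(num_samples)
    ]
    return statistics.stdev(means), sigma / math.sqrt(n)


for n in (1, 4, 16, 64):
    empirical, theoretical = standard_error_demo(n)
    print(f"n={n:3d}  empirical SE={empirical:.4f}  sigma/sqrt(n)={theoretical:.4f}")
```

Each fourfold increase in n should roughly halve both columns, which is the same square-root relationship that governs the width of the sampling distribution of the mean.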

There are several things to try in this applet:

1) Notice how the sampling distribution of the mean (blue curve) compares to the population distribution (red curve). The population distribution has a larger standard deviation by a factor of √n, where n is the sample size.

2) Pick the arcsine distribution and set the left slider to 80 or 100. Notice how samples from an extremely non-normal distribution can have means that are fairly normally distributed when the sample size is large.

3) Pick the binomial distribution and run the left slider from 0 to 100. Notice how the histogram of means tends toward the normal when the slider is near 100. When

4) Pick the exponential distribution. Notice how a skewed distribution like this produces a skewed distribution of sample means (when you set the left slider well below 20). It takes larger sample sizes to rub out skewness.

5) Set both sliders to the far right and run through the distributions at the top. Notice how small the standard error of the mean is for samples from the normal population. The Central Limit Theorem will help you the most if your data are normal to begin with.

6) If you want to see the two theoretical distributions without any sample data, just set the right slider to zero. You can then move the left slider to see how the sampling distribution of means changes with the sample size.

7) Try clicking repeatedly on the distribution buttons at the top of the display. Each click will generate a new sample, so you can get an idea of the variability across samples for a given sample size and number of samples.
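The behavior described in item 4 can also be reproduced outside the applet. The following is a rough sketch under stated assumptions, not the applet's implementation: it assumes an exponential population with rate 1, and the helper names skewness and mean_skewness are hypothetical. It estimates the skewness of the distribution of sample means for several sample sizes. For an exponential population the theoretical skewness of the mean is 2 / sqrt(n), so it shrinks slowly as n grows.

```python
import math
import random
import statistics


def skewness(values):
    """Sample skewness: the mean of the cubed standardized deviations."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values, mu=m)
    return statistics.fmean(((v - m) / sd) ** 3 for v in values)


def mean_skewness(n, num_samples=10000, seed=1):
    """Skewness of the means of num_samples samples of size n.

    Assumption: the population is exponential with rate 1.
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(num_samples)
    ]
    return skewness(means)


for n in (1, 5, 20, 80):
    print(f"n={n:3d}  skewness of means={mean_skewness(n):.3f}  "
          f"theory 2/sqrt(n)={2 / math.sqrt(n):.3f}")
```

Varying the seed argument plays the same role as clicking a distribution button repeatedly: each seed yields a new set of samples, so the estimates fluctuate from run to run around the theoretical value.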