Understanding sampling distributions: an exploration.


1. Click the "Animated sample" button.

Five scores from a normal distribution will be sampled and plotted in a
histogram. The mean of the sample will be computed and plotted in a
second histogram. Repeat this 3 or 4 times, or until you understand
how the "Distribution of Means" is created.

The red line extends from the mean one standard deviation in each
direction. The colored vertical bars on the X-axis correspond to the
statistic of the same color.

2. Click the "5 samples" button to sample 5 samples of 5 scores each.

The five means will be plotted. Click the "500 samples" and/or "2000
samples" buttons until the distribution of means has stabilized.

The sampling distribution of the mean is the distribution that is
approached as the number of samples approaches infinity. With 5,000
to 10,000 samples you get a pretty good approximation.
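
If you want to reproduce steps 1 and 2 outside the applet, the repeated
sampling can be sketched in a few lines of Python. This is only a rough
stand-in for the applet: the function name sample_means, the population
mean of 0, the standard deviation of 1, and the seed are all assumptions
made for illustration.

```python
import random
import statistics

def sample_means(n_samples, sample_size, mu=0.0, sigma=1.0, seed=0):
    """Draw n_samples samples of sample_size normal scores; return their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(sample_size))
        for _ in range(n_samples)
    ]

# 5,000 samples of 5 scores each, as in step 2 (mu and sigma are assumed values).
means = sample_means(n_samples=5000, sample_size=5)

# Center and spread of the approximate sampling distribution of the mean.
print(round(statistics.fmean(means), 3), round(statistics.pstdev(means), 3))
```

Raising n_samples plays the same role as clicking the "2000 samples"
button again: the histogram of means settles down, but its center and
spread stop changing much.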

3. The distribution plotted in (2) above is the sampling distribution of
the mean for a sample size of 5. Approximate the sampling distribution
of the mean for other sample sizes.

size 2:

size 10:

size 20:

4. Any statistic you can compute in a sample has a sampling
distribution. Approximate the sampling distribution of other statistics.
The statistics available to compute are:

Mean
Median
Standard deviation (sd) (Using N in the denominator)
Variance (Using N in the denominator)
Mean absolute deviation from the mean (MAD)
Range
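
The definitions in this list can be made concrete with a short Python
sketch. The helper name describe and the five example scores are
hypothetical; the formulas follow the list above, with N (not N - 1) in
the denominator of the variance and standard deviation.

```python
import statistics

def describe(scores):
    """Compute the six statistics listed above for one sample."""
    n = len(scores)
    mean = statistics.fmean(scores)
    variance = sum((x - mean) ** 2 for x in scores) / n  # N in the denominator
    return {
        "mean": mean,
        "median": statistics.median(scores),
        "sd": variance ** 0.5,
        "variance": variance,
        "mad": sum(abs(x - mean) for x in scores) / n,  # mean absolute deviation from the mean
        "range": max(scores) - min(scores),
    }

# Hypothetical sample of five scores, for illustration only.
print(describe([2, 4, 4, 5, 10]))
```

For these scores the mean is 5, the median is 4, the variance is 7.2,
the MAD is 2, and the range is 8.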

Understanding the Standard Error
1. The standard error of a statistic is the standard deviation of its
sampling distribution. Approximate the sampling distribution of the
mean for N=5. The standard deviation of that distribution is the
standard error of the mean. Find the standard error of the mean and the
standard error of the range for N=10 using the normal distribution.

SE mean:

SE range:
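
One way to check your answers is to estimate a standard error by
simulation: compute the statistic in many samples, then take the
standard deviation of the results. The sketch below assumes a normal
population with mean 0 and standard deviation 1, which may not match the
applet's population; the names standard_error and sample_range are
hypothetical.

```python
import random
import statistics

def standard_error(stat, sample_size, n_samples=5000, seed=0):
    """Standard deviation of stat across many simulated samples."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_samples):
        # Assumed population: normal with mean 0 and standard deviation 1.
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        values.append(stat(sample))
    return statistics.pstdev(values)

def sample_range(scores):
    return max(scores) - min(scores)

se_mean = standard_error(statistics.fmean, sample_size=10)
se_range = standard_error(sample_range, sample_size=10)
print(round(se_mean, 3), round(se_range, 3))
```

Because stat is passed in as a function, the same sketch works for any
statistic in the list, such as statistics.median.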



2. Determine how the standard error is affected by sample size. Plot
the standard error of the mean as a function of sample size for
different standard deviations. Can you discover a formula relating the
standard error of the mean to the sample size and the standard
deviation? If so, see if it holds for distributions other than the normal
distribution.



3. Redo #2 above for the median.




Understanding Bias
1. A statistic is unbiased if the mean of the sampling distribution of the
statistic is the parameter. Test to see if the sample mean is an
unbiased estimate of the population mean. Try it with different sample
sizes and distributions.
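
The definition of bias translates directly into a simulation: average
the statistic over many samples and subtract the parameter. The sketch
below is one possible check, not the applet's method. The function name
estimate_bias is hypothetical, and the two populations (a standard
normal, and an exponential with rate 1 as a skewed example) are assumed
for illustration.

```python
import random
import statistics

def estimate_bias(stat, draw, parameter, sample_size, n_samples=5000, seed=0):
    """Mean of the simulated sampling distribution minus the parameter."""
    rng = random.Random(seed)
    values = [
        stat([draw(rng) for _ in range(sample_size)])
        for _ in range(n_samples)
    ]
    return statistics.fmean(values) - parameter

# Assumed populations: standard normal (mean 0) and exponential with rate 1 (mean 1).
normal_bias = estimate_bias(
    statistics.fmean, lambda r: r.gauss(0.0, 1.0), parameter=0.0, sample_size=5
)
skewed_bias = estimate_bias(
    statistics.fmean, lambda r: r.expovariate(1.0), parameter=1.0, sample_size=5
)
print(round(normal_bias, 3), round(skewed_bias, 3))
```

A result close to zero, relative to the run-to-run noise of the
simulation, is what an unbiased statistic should produce. To test
another statistic, pass it as stat along with the matching population
parameter.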



2. You may be surprised to find out that for some populations, taking
the median from a sample is a biased way to estimate the median of
a population! Find a distribution/sample size combination for which
the sample median is a biased estimate of the population median.
Explain how you know the sample median is a biased estimate of the
population median.

3. Is the sample variance an unbiased estimate of the population
variance? If not, see if you can find a correction based on sample size.
Does the correction hold for distributions other than the normal
distribution?




4. For what statistic is the mean of the sampling distribution
dependent on sample size? Do some experimenting.




Understanding Efficiency
1. For a normal distribution, compare the size of the standard error of
the median and the standard error of the mean. Find a relationship
that holds (approximately) across sample sizes.
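
To compare the two standard errors numerically, compute both statistics
on the same simulated samples and take the ratio of their standard
deviations. This is a sketch under assumptions: the population is taken
to be normal with mean 0 and standard deviation 1, the sample sizes are
arbitrary odd values, and se_ratio is a hypothetical name.

```python
import random
import statistics

def se_ratio(sample_size, n_samples=5000, seed=0):
    """Ratio of the standard error of the median to that of the mean."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(n_samples):
        # Assumed population: normal with mean 0 and standard deviation 1.
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    return statistics.pstdev(medians) / statistics.pstdev(means)

# Odd sample sizes, so the median is always a single middle score.
ratios = {n: se_ratio(n) for n in (5, 15, 25)}
for n, ratio in ratios.items():
    print(n, round(ratio, 2))
```

A ratio above 1 means the median varies more from sample to sample than
the mean does. Swapping rng.gauss for another generator lets you repeat
the comparison for other distributions.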



2. Does this relationship hold for a uniform distribution?




3. Find a distribution for which the standard error of the median is
smaller than the standard error of the mean. (You may find this
difficult, but don't give up.)



4. Compare the standard error of the standard deviation and the
standard error of the mean absolute deviation from the mean (MAD).
Does the relationship depend on the distribution?

Understanding the Central Limit Theorem
1. The central limit theorem states that the sampling distribution of
the mean approaches a normal distribution as the sample size
increases. Sample from the uniform distribution and determine how
large a sample size is needed for the distribution to be a very close
approximation of the normal distribution.
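
Judging "a very close approximation" by eye is hard, so a numeric
summary can help. The sketch below uses excess kurtosis, which is 0 for
a normal distribution, as the measure of closeness. That choice is an
assumption, not something the applet prescribes, and the names
excess_kurtosis and kurtosis_of_means are hypothetical. The uniform
population on the interval from 0 to 1 is also assumed.

```python
import random
import statistics

def excess_kurtosis(values):
    """Fourth standardized moment minus 3; equals 0 for a normal distribution."""
    m = statistics.fmean(values)
    var = statistics.fmean((x - m) ** 2 for x in values)
    fourth = statistics.fmean((x - m) ** 4 for x in values)
    return fourth / var ** 2 - 3.0

def kurtosis_of_means(sample_size, n_samples=20000, seed=0):
    """Excess kurtosis of simulated means of uniform(0, 1) samples."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.random() for _ in range(sample_size))
        for _ in range(n_samples)
    ]
    return excess_kurtosis(means)

for n in (1, 2, 5, 30):
    print(n, round(kurtosis_of_means(n), 2))
```

The uniform distribution is symmetric, so skewness would not reveal much
here; for a skewed population a skewness measure would be the more
informative check.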



2. Do the same thing sampling from the skewed distribution.



3. Determine whether the sampling distribution of the median
approaches a normal distribution as sample size increases.