The Distribution of Sample Medians as a Function of Sample Size
The purpose of this page is to illustrate what happens to the distribution of sample medians as the size of the samples increases. This is done by drawing 2000 samples of each size, computing their medians, and constructing the histogram of those sample medians. (Each time the screen is refreshed a new batch of 2000 samples is created.)
Let p(x) be the probability density function of a random variable x. The median of this distribution is denoted ν. It is defined as the value such that there are equal probabilities of getting a value larger than ν and a value smaller than ν; i.e.,

$$\int_{-\infty}^{\nu} p(x)\,dx = \int_{\nu}^{\infty} p(x)\,dx = \tfrac{1}{2}$$
Below are shown the histograms for samples of various sizes.
The random variable is uniformly distributed from -0.5 to +0.5; i.e.,

$$p(x) = \begin{cases} 1 & \text{for } -0.5 \le x \le 0.5 \\ 0 & \text{otherwise} \end{cases}$$
The value of ν for this distribution is 0.
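The experiment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind the page: the function names `sample_medians` and `histogram`, the choice of 20 bins, and the sample sizes printed at the end are all hypothetical, and the histogram is drawn as rows of `#` characters rather than as a chart.

```python
import random
import statistics


def sample_medians(n, num_samples=2000, rng=None):
    """Draw num_samples samples of size n from Uniform(-0.5, 0.5); return their medians."""
    rng = rng or random.Random()
    return [
        statistics.median([rng.uniform(-0.5, 0.5) for _ in range(n)])
        for _ in range(num_samples)
    ]


def histogram(values, num_bins=20, low=-0.5, high=0.5):
    """Count how many values fall into each of num_bins equal-width bins on [low, high]."""
    width = (high - low) / num_bins
    counts = [0] * num_bins
    for v in values:
        # Clamp so that a value exactly equal to `high` lands in the last bin.
        index = min(int((v - low) / width), num_bins - 1)
        counts[index] += 1
    return counts


if __name__ == "__main__":
    # Sample sizes chosen for illustration only.
    for n in (1, 5, 25, 101):
        counts = histogram(sample_medians(n))
        print(f"n = {n}")
        for i, count in enumerate(counts):
            left_edge = -0.5 + i * 0.05  # bin width is 1/20 with the default bins
            print(f"{left_edge:+.2f} {'#' * (count // 10)}")
```

Each run produces a fresh batch of samples, so the bar lengths vary from run to run, but the histograms should become visibly narrower around 0 as n grows.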
For a sample of size n the median is found by ranking the sample values. For n odd the median is the value in the (n+1)/2 place in the ranking. For n even the median is taken to be the average of the values at the n/2 and (n/2)+1 places in the ranking. Thus for n=3 the second value in the ranking is taken. For n=4 the average of the second and third in the ranking is taken.
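The ranking rule can be written out directly. The sketch below is a minimal illustration; the function name `median_by_ranking` is hypothetical. For even n it takes the two middle places to be n/2 and n/2 + 1, which is what the n=4 example (second and third values) implies. Python lists are indexed from 0, so each place in the ranking is shifted down by one.

```python
def median_by_ranking(sample):
    """Return the sample median found by ranking (sorting) the values."""
    ranked = sorted(sample)
    n = len(ranked)
    if n == 0:
        raise ValueError("sample must not be empty")
    if n % 2 == 1:
        # Odd n: the value in place (n+1)/2 of the ranking (1-based).
        return ranked[(n + 1) // 2 - 1]
    # Even n: average of the values in places n/2 and n/2 + 1 (1-based).
    return (ranked[n // 2 - 1] + ranked[n // 2]) / 2


print(median_by_ranking([0.25, -0.5, 0.5]))       # n=3: second value in the ranking
print(median_by_ranking([0.25, -0.5, 0.0, 0.5]))  # n=4: average of second and third
```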
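The probability density function q(x) for the sample median, where n is even and the median value x has n/2 sample values below it and n/2 above it (n+1 values in all), is

$$q(x) = A\,[P(x)]^{n/2}\,[1-P(x)]^{n/2}\,p(x)$$

where P(x) is the cumulative distribution function of x, and the factor A takes into account the number of ways the n/2 values above x and the n/2 values below x can be chosen.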
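For the uniform distribution p(x), the distribution of the sample median reduces to

$$q(x) = A\,(0.5 + x)^{n/2}\,(0.5 - x)^{n/2} = A\,(0.25 - x^2)^{n/2}, \qquad -0.5 \le x \le 0.5$$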
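To check that q(x) is a proper probability density function, consider its integral over the interval [-0.5, x]; i.e.,

$$Q(x) = \int_{-0.5}^{x} q(z)\,dz$$

For q(x) to be a proper density, Q(0.5) must equal 1, and this requirement fixes the value of A.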
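The check can also be made numerically. The sketch below rests on two assumptions that are not spelled out on this page: that for the uniform case the density has the form q(x) = A (0.5 + x)^(n/2) (0.5 - x)^(n/2), as the description of n/2 values below x and n/2 values above x suggests, and that A = (n+1)! / ((n/2)!)^2, the number of ways to arrange n/2 values below, one value at x, and n/2 values above. The function names are hypothetical, and Simpson's rule is used here simply as a convenient integration method.

```python
import math


def normalizing_constant(n):
    """Assumed value of A for even n: (n+1)! / ((n/2)!)^2."""
    if n < 0 or n % 2 != 0:
        raise ValueError("n must be a non-negative even integer")
    half = n // 2
    return math.factorial(n + 1) // (math.factorial(half) ** 2)


def q(x, n):
    """Assumed median density for the uniform distribution on [-0.5, 0.5]."""
    if not -0.5 <= x <= 0.5:
        return 0.0
    return normalizing_constant(n) * ((0.5 + x) * (0.5 - x)) ** (n // 2)


def integrate_q(n, upper=0.5, steps=2000):
    """Integrate q from -0.5 to `upper` with Simpson's rule (steps must be even)."""
    a = -0.5
    h = (upper - a) / steps
    total = q(a, n) + q(upper, n)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * q(a + i * h, n)
    return total * h / 3


for n in (2, 4, 10, 20):
    print(n, normalizing_constant(n), integrate_q(n), integrate_q(n, upper=0.0))
```

Under these assumptions the integral over the whole interval should come out very close to 1 for each n, and the integral up to 0 should come out close to 0.5, in agreement with ν = 0.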
Although the distribution of sample medians appears to get closer to that of a normal distribution for larger n, there is not the same analytical justification as there is for sample means.
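One informal way to see the apparent approach to normality is to track the excess kurtosis of the simulated medians, which is 0 for a normal distribution and -1.2 for the uniform distribution itself. The sketch below is only an empirical illustration with hypothetical function names and an arbitrary fixed seed; a value near 0 is a symptom of normality, not a proof of it.

```python
import random
import statistics


def excess_kurtosis(values):
    """Fourth central moment divided by the squared variance, minus 3."""
    count = len(values)
    mean = sum(values) / count
    m2 = sum((v - mean) ** 2 for v in values) / count
    m4 = sum((v - mean) ** 4 for v in values) / count
    return m4 / m2 ** 2 - 3.0


def median_excess_kurtosis(n, num_samples=2000, seed=0):
    """Excess kurtosis of the medians of num_samples Uniform(-0.5, 0.5) samples of size n."""
    rng = random.Random(seed)  # fixed seed so that the run is repeatable
    medians = [
        statistics.median([rng.uniform(-0.5, 0.5) for _ in range(n)])
        for _ in range(num_samples)
    ]
    return excess_kurtosis(medians)


for n in (1, 3, 11, 101):
    print(n, round(median_excess_kurtosis(n), 3))
```

With 2000 samples the estimates are noisy, so only the broad trend toward 0 as n increases should be read from the output.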
HOME PAGE OF Thayer Watkins