**Goodness of Fit Test
Exercises**

In this chapter we compare the distribution of a nominal outcome to that of an external standard distribution. The frequency of various
outcomes is tallied with a `FREQ` command. As an example, the frequency of war in the 432 years between 1500 and 1931 is shown below:

| Number of Wars Initiated in the Year | Observed Frequency (o_{i}) |
|---|---|
| 0 | 223 |
| 1 | 142 |
| 2 | 48 |
| 3 | 15 |
| 4+ | 4 |
| TOTAL | 432 |

(Source: Richardson, 1944).

We want to compare the observed distribution to what would be expected under a random (Poisson) distribution. We calculate the
distribution under the Poisson model (see any standard statistics text for information about how to fit a Poisson distribution) and find
the following expected frequencies:

| Number of Wars Initiated in the Year | Expected Frequency (e_{i}) |
|---|---|
| 0 | 216.69 |
| 1 | 149.52 |
| 2 | 51.58 |
| 3 | 11.88 |
| 4+ | 2.38 |
| TOTAL | 432.05 |

The expected frequencies appear to parallel the observed frequencies well, but we want to test the fit of the model formally. The test
treats the data as a multinomial experiment in which there are *n* identical, independent trials, with the outcome of
each trial falling into one of *k* categories. The probabilities associated with the *k* categories, denoted *p*_{1}, *p*_{2}, ..., *p*_{k}, are assumed to
remain constant from trial to trial, and the sum of the probabilities is 1.

The null and alternative hypotheses are:

*H*_{0}: the *p*_{i} are those predicted by the multinomial model

*H*_{1}: at least one *p*_{i} differs from the value predicted by the model

The fit can be tested with Pearson's chi-square statistic:

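χ^{2}_{stat} = Σ [(*o*_{i} - *e*_{i})^{2} / *e*_{i}]

where *o*_{i} represents the observed frequency and *e*_{i} the expected frequency in category *i*, and the sum is taken over all *k* categories.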

The test is right-tailed with *k* - 1 degrees of freedom (where *k* represents the number of categories). The illustrative example has 5 - 1
= 4 degrees of freedom. To run this goodness-of-fit test in *Epi Info*, click **Programs > EpiTable > Compare > Proportions
> Goodness of Fit**.

Output is:

`Class Observed Expected`

`---------------------------------------`

`Nº1 223 216.6900`

`Nº2 142 149.5200`

`Nº3 48 51.5800`

`Nº4 15 11.8800`

`Nº5 4 2.3800`

`10.0 % of expected value < 5`

`Chi² 2.73`

`Degrees of freedom 4`

`p-value 0.603482`

Since the observed frequencies closely parallel the expected frequencies, and the goodness-of-fit test yields *p* = .60, *H*_{0} is
retained. The data are consistent with wars being randomly distributed over time.
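The chi-square statistic and *p* value in the output above can also be checked by hand. The following is a minimal Python sketch, not the method *Epi Info* uses internally. The function names are illustrative, and the tail-probability helper relies on a closed form that holds only for an even number of degrees of freedom (as is the case here, with 4).

```python
import math

# Observed and expected frequencies copied from the war tables in this chapter
observed = [223, 142, 48, 15, 4]
expected = [216.69, 149.52, 51.58, 11.88, 2.38]


def pearson_chi_square(obs, exp):
    """Sum of (o_i - e_i)^2 / e_i over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))


def chi_square_upper_tail(x, df):
    """Right-tail probability P(X >= x) for a chi-square variable.

    ASSUMPTION: uses the closed form exp(-x/2) * sum((x/2)^j / j!),
    j = 0 .. df/2 - 1, which is valid only when df is even.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch handles even, positive df only")
    half = x / 2.0
    terms = sum(half ** j / math.factorial(j) for j in range(df // 2))
    return math.exp(-half) * terms


chi2_stat = pearson_chi_square(observed, expected)
df = len(observed) - 1  # k - 1, as stated in the text
p_value = chi_square_upper_tail(chi2_stat, df)

print(f"chi-square = {chi2_stat:.2f}, df = {df}, p = {p_value:.4f}")
```

The result should agree with the *Epi Info* output to about two decimal places; small differences in later digits can arise from rounding in the tabled expected frequencies.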

**(1) PRUSSIAN**: *Fatal Horse Kicks in the Prussian Army* (Bortkiewicz, 1898). A famous historical application of the Poisson model
investigated fatal horse kicks in the Prussian army corps in the years between 1875 and 1894. Ten army corps were observed for 20
years (*n* = 200). Data, along with expected probabilities and frequencies under the Poisson model, are shown in the table below.
Determine whether these Poisson probabilities fit the data.

| No. of Fatalities (x_{i}) | Observed Frequency (o_{i}) | Poisson Probability (p_{i}) | Expected Frequency (e_{i}) |
|---|---|---|---|
| 0 | 109 | .543351 | 108.68 |
| 1 | 65 | .331444 | 66.28 |
| 2 | 22 | .101090 | 20.22 |
| 3+ | 4 | .024115 | 4.82 |
| TOTAL | 200 | 1.000000 | 200.00 |
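For readers curious where the probability column comes from, the sketch below shows one way such Poisson probabilities could be computed in Python. The rate of 0.61 is an assumption: it is not stated in the exercise but is back-calculated from the tabled value *p* = .543351 for zero fatalities, which equals exp(-0.61) to six decimals. The function names are illustrative.

```python
import math


def poisson_pmf(x, rate):
    """P(X = x) for a Poisson variable with the given mean rate."""
    return math.exp(-rate) * rate ** x / math.factorial(x)


def grouped_poisson_probs(rate, last_count):
    """Probabilities for 0 .. last_count - 1, plus a 'last_count or more' class."""
    probs = [poisson_pmf(x, rate) for x in range(last_count)]
    probs.append(1.0 - sum(probs))  # open-ended top category
    return probs


rate = 0.61  # ASSUMPTION: inferred from the tabled p for 0 fatalities, not given in the text
n = 200      # 10 army corps x 20 years, from the exercise

probs = grouped_poisson_probs(rate, last_count=3)
expected = [n * p for p in probs]

for label, p, e in zip(["0", "1", "2", "3+"], probs, expected):
    print(f"{label:>2}  p = {p:.6f}  e = {e:.2f}")
```

Under this assumed rate the probabilities match the table to six decimals; the expected frequencies may differ from the tabled values by about 0.01 because of rounding. The goodness-of-fit test itself is left as the exercise.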

**(2) FEV.ZIP**: *Gender distribution in an adolescent sample* (Rosner, 1990). The file `FEV.REC` contains data from a respiratory health
survey in children and adolescents from East Boston, MA. One of the variables in this data set is `SEX`, coded 0 = female, 1 = male.
Determine the frequency of boys and girls in the sample. If there were an equal number of boys and girls in the population from which
the sample was drawn, how many boys would be expected in the sample? How many girls? Put these data, along with the observed
frequencies, into an observed and expected frequency table like the ones used in this chapter. Use a goodness-of-fit test to help
determine whether there is an equal number of boys and girls in the study population.

Bortkiewicz, L. von. (1898). *Das Gesetz der kleinen Zahlen.* Leipzig: Teubner.

Richardson, L. F. (1944). The Distribution of Wars in Time (in Miscellanea). *Journal of the Royal Statistical Society, B., 107* (No. 3/4), 242-250.

Rosner, B. (1990). *Fundamentals of Biostatistics* (3rd ed.). Boston: PWS-Kent Publishing.