**Background | Descriptive Statistics | Confidence Interval | *p* Value | Sample Size and Precision | Exercises**

This chapter teaches you to analyze a *continuous outcome* from a single group. The term
*continuous outcome* as used here denotes any quantitative measure,
including integer, ratio, and ordinal measurements.

No active control group is present. Thus, if comparisons are to be made, they must be in relation to an external "norm" or historical data.

**Illustrative data**: Data in the file

`| 8|8`

`| 9|59`

`|10|0147`

`|11|444679`

`|12|0145`

`|13|`

`|14|`

`|15|2`

`% of ideal body weight (x10)`

The plot reveals that all but one data point lie between 88 and 125 (distributional spread). The center of the distribution is around 110 (central location). The distribution has one high outside value (152). Other than this outside value, the data seem to have a negative skew (a tail toward the lower values). As the famous NY Yankee catcher Yogi Berra is rumored to have said, "You can observe a lot by watching."
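As a rough illustration of how such a plot is built, the Python sketch below regenerates the stem-and-leaf display from the 18 values read back off the plot (stem x 10 + leaf). The list name `PERIDEAL` and the function `stem_and_leaf` are hypothetical names chosen for this sketch; they are not part of *Epi Info*.

```python
from collections import defaultdict

# The 18 values read back from the stem-and-leaf plot (stem x 10 + leaf).
PERIDEAL = [88, 95, 99, 100, 101, 104, 107, 114, 114, 114,
            116, 117, 119, 120, 121, 124, 125, 152]


def stem_and_leaf(values):
    """Return stem-and-leaf rows (stem = tens, leaf = units) as strings."""
    leaves = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        leaves[stem].append(str(leaf))
    # Include empty stems so that gaps in the distribution stay visible.
    rows = []
    for stem in range(min(leaves), max(leaves) + 1):
        rows.append(f"|{stem:2d}|{''.join(leaves.get(stem, []))}")
    return rows


for row in stem_and_leaf(PERIDEAL):
    print(row)
```

Keeping the empty stems (13 and 14) in the output is what makes the outside value at 152 stand out from the rest of the distribution.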

Each *Epi Info* session begins by `READ`ing (opening) the data set:

`EPI6> READ ONEGRP`

A one-variable `MEANS` command is issued to describe the data:

`EPI6> MEANS PERIDEAL`

The following summary statistics are provided:

` Total Sum Mean Variance Std Dev Std Err`

` 18 2030 112.778 208.065 14.424 3.400`

` Minimum 25%ile Median 75%ile Maximum Mode`

` 88.000 101.000 114.000 120.000 152.000 114.000`

Comments:

(1) Always report the distribution's mean and standard deviation. The sample size (reported under `Total`) should also be reported.

(2) Although *Epi Info* reports summary statistics to three decimal places, fewer decimals should be reported to avoid giving a false impression of precision. A rule of thumb is to report summary statistics with one more decimal place than the original measurement. For example, since the variable is measured to the nearest whole unit, we would report summary statistics to one decimal place, e.g., mean = 112.8, standard deviation = 14.4 (*n* = 18).

(3) It is often useful to report a *five-point summary* of the distribution comprising the distribution's minimum, 25^{th} percentile, median, 75^{th} percentile, and maximum (e.g., 88, 101, 114, 120, 152).

(4) The mode is seldom of interest with small data sets.
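For readers who want to cross-check the `MEANS` output outside *Epi Info*, the following Python sketch recomputes the summary statistics from the 18 values read off the stem-and-leaf plot. The list name `PERIDEAL` mirrors the variable name. The function `five_point_summary` and its quartile rule (medians of the lower and upper halves of the sorted data) are assumptions of this sketch; other software may use a different quartile rule and give slightly different values.

```python
import statistics
from math import sqrt

# The 18 values read back from the stem-and-leaf plot.
PERIDEAL = [88, 95, 99, 100, 101, 104, 107, 114, 114, 114,
            116, 117, 119, 120, 121, 124, 125, 152]

n = len(PERIDEAL)
mean = statistics.mean(PERIDEAL)
variance = statistics.variance(PERIDEAL)  # sample variance (n - 1 divisor)
std_dev = statistics.stdev(PERIDEAL)      # square root of the sample variance
std_err = std_dev / sqrt(n)               # standard error of the mean


def five_point_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum).

    Assumption: quartiles are the medians of the lower and upper halves.
    """
    x = sorted(values)
    half = len(x) // 2
    lower, upper = x[:half], x[len(x) - half:]
    return (x[0], statistics.median(lower), statistics.median(x),
            statistics.median(upper), x[-1])


# Report one decimal place more than the original measurement.
print(f"n = {n}, mean = {mean:.1f}, SD = {std_dev:.1f}, SE = {std_err:.1f}")
print("five-point summary:", five_point_summary(PERIDEAL))
```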

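The sample mean is the point estimator of the expected value µ. A (1 - α)100% confidence interval for µ is calculated with the formula:

`MEAN` ± (*t*_{*n*-1, 1-α/2})(`Std Err`)

where *t*_{*n*-1, 1-α/2} represents the (1 - α/2)100^{th} percentile of a *t* distribution with *n* - 1 degrees of freedom.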
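As a numerical illustration, the Python sketch below computes a 95% confidence interval for the illustrative data, assuming the usual *t*-based interval (mean ± *t* percentile × standard error). The summary statistics are taken from the `MEANS` output above. The constant `T_17_975` is the 97.5^{th} percentile of the *t* distribution with 17 degrees of freedom, hard-coded from a standard *t* table; this is an assumption of the sketch, since *Epi Info* is not consulted for it.

```python
from math import sqrt

# Summary statistics as reported by the MEANS output for PERIDEAL.
n, mean, std_dev = 18, 112.778, 14.424

# 97.5th percentile of the t distribution with n - 1 = 17 df,
# hard-coded from a standard t table (an assumption of this sketch).
T_17_975 = 2.110

std_err = std_dev / sqrt(n)      # standard error of the mean
margin = T_17_975 * std_err      # half-width of the interval
lower, upper = mean - margin, mean + margin

print(f"95% CI for mu: ({lower:.1f}, {upper:.1f})")
```

The interval lies entirely above 100, which anticipates the result of the hypothesis test that follows.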

A one-sample *t* statistic is used to test *H*_{0}: µ = µ_{0}, where µ_{0} represents the expected value under the null hypothesis. For our illustrative
example, let us ask whether µ differs from 100, since 100 represents 100% of ideal body weight. Therefore, *H*_{0}: µ = 100.

The one-sample *t* statistic is:

*t*_{stat} = (`MEAN` - µ_{0}) / (`Std Err`)

Under the null hypothesis this statistic has a *t* distribution with *n* - 1 degrees of freedom. For the illustrative data, *t*_{stat} = (112.778 -
100) / 3.400 = 3.76 with *df* = 18 - 1 = 17. The two-sided *p* value is the area under the *t*_{17} curve in the two tails beyond -3.76 and +3.76.
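To make the "area in the tails" idea concrete, the Python sketch below computes the *t* statistic from the summary statistics and then approximates the two-sided *p* value by numerically integrating the *t* density with Simpson's rule. The function names `t_pdf` and `two_sided_p` are hypothetical, and the numerical integration is a stand-in for a statistical table or library routine. The result need not agree with any particular program's printed *p* value to every digit.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p value: area in both tails beyond |t_stat|.

    Integrates the density over [0, |t_stat|] with Simpson's rule
    (steps must be even) and subtracts the result from one half.
    """
    b = abs(t_stat)
    h = b / steps
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3   # area between 0 and |t_stat|
    return 2 * (0.5 - central_half)


# Summary statistics from the MEANS output; mu0 is the null value.
mean, std_err, n, mu0 = 112.778, 3.400, 18, 100
t_stat = (mean - mu0) / std_err
df = n - 1

print(f"t = {t_stat:.2f}, df = {df}, "
      f"two-sided p = {two_sided_p(t_stat, df):.4f}")
```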

To have *Epi Info* calculate a one-sample *t* statistic, issue the commands:

`EPI6> DEFINE NULLVAL <###.#>`

`EPI6> LET NULLVAL = <num>`

`EPI6> DELTA = <varname> - NULLVAL`

`EPI6> MEANS DELTA `

The first two lines of this program set the null value for the test. The next line computes differences between observed values and the
null value. The last line calculates the *t* statistic and *p* value.

For the illustrative example the following commands are issued:

`EPI6> DEFINE NULLVAL ###`

`EPI6> LET NULLVAL = 100`

`EPI6> DELTA = PERIDEAL - NULLVAL`

`EPI6> MEANS DELTA `

Relevant output is:

`Student's "t", testing whether mean differs from zero.`

`T statistic = 3.758, df = 17 p-value = 0.00190`
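The `DELTA` approach works because subtracting a constant from every observation shifts the mean by that constant but leaves the standard deviation unchanged. Testing whether the mean of `DELTA` differs from zero is therefore the same as testing whether the mean of the original variable differs from the null value. The Python sketch below illustrates this equivalence with the 18 values read off the stem-and-leaf plot; the helper `t_against` is a hypothetical name used only here.

```python
import statistics
from math import sqrt

# The 18 values read back from the stem-and-leaf plot.
PERIDEAL = [88, 95, 99, 100, 101, 104, 107, 114, 114, 114,
            116, 117, 119, 120, 121, 124, 125, 152]
NULLVAL = 100

# Differences between the observed values and the null value.
DELTA = [x - NULLVAL for x in PERIDEAL]


def t_against(values, mu0):
    """One-sample t statistic for H0: mean = mu0."""
    n = len(values)
    std_err = statistics.stdev(values) / sqrt(n)
    return (statistics.mean(values) - mu0) / std_err


t_delta = t_against(DELTA, 0)            # mean of DELTA versus zero
t_direct = t_against(PERIDEAL, NULLVAL)  # mean of PERIDEAL versus 100

print(f"t from DELTA = {t_delta:.3f}, t computed directly = {t_direct:.3f}")
```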

Let *d* represent the margin of error (approximately half the length of the 95% confidence interval). To achieve a study with precision *d*,
use a sample of size:

*n* = (4*s*^{2})/*d*^{2}

where *s* represents the standard deviation of the variable. For example, to achieve *d* = 5 for a variable with standard deviation *s* = 15, *n*
= (4)(15^{2})/5^{2} = 36.

Comment: One of the more difficult aspects of using this method is coming up with a reasonable estimate for *s*. Such estimates may come from a pilot study or from previous experience.
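The factor of 4 in the formula comes from setting the margin of error to roughly two standard errors, *d* ≈ 2*s*/√*n*, and solving for *n*. The Python sketch below wraps the formula in a small function. The name `sample_size` is hypothetical, and rounding up to the next whole subject is an assumption added here, since the text does not say how to handle fractional results.

```python
from math import ceil


def sample_size(s, d):
    """Sample size for margin of error d given standard deviation s.

    Implements n = 4 s^2 / d^2, rounded up to a whole number
    (the rounding is an assumption of this sketch).
    """
    if s <= 0 or d <= 0:
        raise ValueError("s and d must be positive")
    return ceil(4 * s ** 2 / d ** 2)


# Example from the text: s = 15 and d = 5.
print(sample_size(15, 5))
```

Because *n* grows with 1/*d*^{2}, halving the margin of error requires about four times as many observations.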

**(1) UNICEF.ZIP:** *Low Birth Weight Rates Worldwide* (Pagano and Gauvreau, 1993, p. 55; United Nations Children's Fund, 1991). A
weight at birth of less than 2,500 grams -- about 5.5 pounds -- is considered a low birth weight. The rate of low birth weights in a
country is an index of maternal and child health. The variable `LOWBW` in `UNICEF.REC` contains low birth-weight rates per 100 births
for the year 1991 from various countries.

(A) Sort these data in low birth-weight rate order by issuing the command `SORT LOWBW`. Then list the data to determine which
country demonstrates the lowest low birth-weight rate. Also determine the country with the highest low birth-weight rate.

(B) What is the low birth weight rate in the United States? The easiest way to find this information is to sort data in alphabetic order by
country (`SORT COUNTRY`) and then `LIST` the data to find the record for the United States. Where does the U.S. rank among other
countries? (Issue a `MEANS LOWBW` command and look up the cumulative frequency of the U.S.'s rate. This will represent its
approximate percentile rank.)

(C) Plot the data in the form of a histogram. (*Comment*: The data set is large enough to make grouping it into class intervals
unnecessary.) In words, describe the distribution.

(D) Compute and report summary statistics for `LOWBW`.

(E) Assuming these data represent a random sample of low birth weight rates worldwide, calculate a 95% confidence interval for the
expected low birth weight rate.

**(2) SEIZURE.ZIP:** *Seizures Following Bacterial Meningitis* (Pagano and Gauvreau, 1993, p. 54; Pomeroy et al., 1990). A study
investigated the long-term prognosis of children following bacterial meningitis. This study determined the number of *months* between
the onset of meningitis and subsequent seizures as being: 0.1, 0.25, 0.5, 4, 12, 12, 24, 24, 31, 36, 42, 55, 96.

(A) Create a data file with these data. Call the data set `SEIZURE.REC` and the variable `MONTHS`.

(B) Report the five-point summary for these data (`MEANS MONTHS`).

(C) Group data into class intervals of width 20. Then construct a frequency table based on these groupings.

(D) Construct a histogram based on the 20-unit class intervals.

(E) Previous studies suggest a mean time to seizure of 12 months. Using these data, test whether this mean has changed. In completing
this analysis, list the null and alternative hypotheses, and report the *t* statistic, its degrees of freedom, and the *p* value. Let α = 0.05. State your
conclusion.

**(3) SERZINC.ZIP:** *Zinc Levels in 15- to 17-year-old Males* (Pagano and Gauvreau, 1993, pp. 32 and 55). The data set `SERZINC.REC`
contains serum zinc values (mcg/dl) for 462 boys between the ages of 15 and 17. *Download* and unzip this data set and then:

(A) Compute its mean, standard deviation, and sample size.

(B) Group the data into class intervals of width 20 and compile a frequency table from the grouped data. Then create a
`HISTOGRAM` of the grouped data.

(C) Calculate a 95% confidence interval for the population mean.

(D) Test whether the population mean is significantly different from 85 mcg/dl. Let α = 0.05. (List all elements of the hypothesis test.)

Pagano, M., and Gauvreau, K. (1993). *Principles of Biostatistics*. Belmont, CA: Duxbury Press.

Pomeroy, S. L., Holmes, S. J., Dodge, P. R., and Feigin, R. D. (1990). Seizures and other neurologic sequelae of bacterial meningitis in
children. *New England Journal of Medicine*, 323, 1651-1656.

Saudek, C. D., Selam, J. L., Pitt, H. A., Waxman, K., Rubio, M., Jeandidier, N., Turner, D., Fischell, R. E., and Charles, M. A. (1989).
A preliminary trial of the programmable implantable medication system for insulin delivery. *New England Journal of Medicine*, 321,
574-579.

United Nations Children's Fund. (1991). *The State of the World's Children, 1991*. New York: Oxford University Press.