Introduction | Descriptive Statistics | Confidence Interval | *p* Value | Exercises

This chapter considers the analysis of a continuous outcome with data collected by paired samples. An example of this type of sample is a pre-test/post-test sample in which an outcome is measured before and after an intervention. Paired samples can also be achieved by matching and by non-experimental sequential measurements.

*Illustrative example.* The dataset

`STATE BED80 BED86`

`----- ----- -----`

` 1 4.7 4.2`

` 2 3.9 3.3`

` 3 4.4 4.0`

`etc.`

` 51 3.1 2.6`

We want to describe changes in the number of hospital beds over the study interval. As usual, we begin with a careful descriptive analysis.

First, we describe each measurement separately. This is accomplished by `READ`ing the data set into the current session and issuing separate `MEANS` commands against the two variables:

`EPI6> READ BED.REC`

`EPI6> MEANS BED80`

`EPI6> MEANS BED86`

Output (not shown to save space) reveals that the mean number of beds (per 1000) decreased from 4.56 in 1980 to 4.23 in 1986.

Next, we create a new variable (`DELTA`) with the `DEFINE` command as follows:

`EPI6> DEFINE DELTA ##.#`

`EPI6> DELTA = BED80 - BED86`

When defining `DELTA`, make certain the numeric field indicator is sufficient to capture all possible values for within-pair differences. For the illustrative example, the variable was defined with the structure `##.#` in order to reserve space for the negative sign and decimal values.

Individual differences are `LIST`ed:

`STATE BED80 BED86 DELTA`

`--- ----- ----- -----`

` 1 4.7 4.2 0.5`

` 2 3.9 3.3 0.6`

` 3 4.4 4.0 0.4`

`etc.`
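For readers working outside Epi Info, the same within-pair differencing can be sketched in Python. This is an illustration only, not part of the Epi Info session: it uses just the first three states from the listing, and the names `pairs` and `delta` are hypothetical.

```python
# First three (BED80, BED86) pairs from the listing (beds per 1000).
pairs = [(4.7, 4.2), (3.9, 3.3), (4.4, 4.0)]

# DELTA = BED80 - BED86, rounded to one decimal place to mirror the ##.# field.
delta = [round(bed80 - bed86, 1) for bed80, bed86 in pairs]

print(delta)
```

Rounding matters here because binary floating-point subtraction of values such as 3.9 and 3.3 does not yield exactly 0.6.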

Summary statistics for `DELTA` are computed:

`EPI6> MEANS DELTA`

Output from this command is:

`DELTA`

` Total Sum Mean Variance Std Dev Std Err`

` 51 16 0.324 0.113 0.336 0.047`

` Minimum 25%ile Median 75%ile Maximum Mode`

` -0.800 0.200 0.400 0.500 0.900 0.200`

Therefore, the mean decline is 0.32 beds per 1000 (*s*_{d} = 0.34, *n* = 51).

Comments

(1) See the prior unit for comments regarding the reporting and interpretation of descriptive statistics.

(2) The above output (derived by *Epi Info* v.6) has a bug: it reports the minimum as -0.8 when in fact it is -1.0.
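The variance and standard error in the `MEANS DELTA` output follow from the standard deviation and the number of pairs. The short Python sketch below checks this using the rounded figures printed in the output. The variable names are hypothetical, and the check is only as accurate as the rounding allows.

```python
import math

n = 51        # number of pairs ("Total" in the output)
sd_d = 0.336  # standard deviation of DELTA ("Std Dev")

variance = sd_d ** 2            # variance is the squared standard deviation
std_err = sd_d / math.sqrt(n)   # standard error of the mean difference

print(round(variance, 3), round(std_err, 3))
```

Both results agree, to three decimal places, with the `Variance` and `Std Err` columns of the output.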

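The point estimator of the expected difference (µ_{d}) is the sample mean difference (`MEAN _{DELTA}`). A 95% confidence interval for µ_{d} is given by:

$$\text{MEAN}_{\text{DELTA}} \pm t_{n-1,\,.975} \cdot se$$

where `MEAN _{DELTA}` = the mean of the within-pair differences, $t_{n-1,\,.975}$ = the 97.5th percentile of the *t* distribution with *n* - 1 degrees of freedom, and *se* = *s*_{d} / √*n* = the standard error of the mean difference (reported as `Std Err` in the `MEANS` output).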

Comments

(1) The above interval locates µ_{d} with 95% confidence.

(2) The width of the confidence interval is a measure of the estimate's precision.

(3) The method assumes data are free of nonrandom sources of error (i.e., information bias, selection bias, and confounding) and that random error tends toward normality.
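As a worked illustration, a 95% confidence interval for the illustrative data can be computed from the summary statistics reported by `MEANS DELTA`. The sketch below is a hand calculation in Python rather than Epi Info output. It assumes a critical value of 2.009 for 50 degrees of freedom, taken from a standard *t* table, and the variable names are hypothetical.

```python
mean_d = 0.324   # mean of DELTA from the MEANS output
std_err = 0.047  # standard error of DELTA from the MEANS output
t_crit = 2.009   # assumed t table value: 50 df, 97.5th percentile

margin = t_crit * std_err
lower, upper = mean_d - margin, mean_d + margin

print(f"95% CI for the mean difference: ({lower:.2f}, {upper:.2f})")
```

Because the inputs are rounded to three decimals, the limits should be read as approximate.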

To test *H*_{0}: µ_{d} = 0, use the paired *t* statistic:

$$t_{\text{stat}} = \frac{\text{MEAN}_{\text{DELTA}}}{se}$$

where *se* = *s*_{d} / √*n* is the standard error of the mean difference.

Under *H*_{0}, this statistic has a *t* sampling distribution with *n* - 1 degrees of freedom. For the illustrative data, *t*_{stat} = 0.324 / 0.047 ≈ 6.87 (the rounded figures shown here give about 6.89; the value 6.87 reflects unrounded inputs) and *df* = 51 - 1 = 50. The *p* value is the area in the tail (or tails) of the appropriate *t* distribution.
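The tail area can be approximated numerically as a rough cross-check on the *p* value. The Python sketch below is an illustration under stated assumptions, not Epi Info's algorithm: it integrates the *t* density over the upper tail with Simpson's rule using only the standard library, and the helper names `t_pdf` and `two_sided_p` are hypothetical. It works from the rounded mean and standard error, so its statistic is about 6.89 rather than the 6.872 that Epi Info reports, presumably from unrounded values.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t_stat, df, width=100.0, steps=20000):
    """Approximate two-sided p value via Simpson's rule on the upper tail.

    Assumption: the tail beyond |t| + width is negligible, which holds for
    moderate df such as 50 but not for very small df. steps must be even.
    """
    a = abs(t_stat)
    h = width / steps
    total = t_pdf(a, df) + t_pdf(a + width, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(a + i * h, df)
    upper_tail = total * h / 3
    return 2 * upper_tail

mean_d, std_err, n = 0.324, 0.047, 51
t_stat = mean_d / std_err
df = n - 1
p_value = two_sided_p(t_stat, df)

print(f"t = {t_stat:.2f}, df = {df}, two-sided p = {p_value:.2e}")
```

The resulting *p* value is far below 0.00001, which is consistent with a value that prints as 0.00000 at five decimal places.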

*Epi Info* computes the paired *t* test when its `MEANS` command is directed against the `DELTA` variable:

`Student's "t", testing whether mean differs from zero.`

`T statistic = 6.872, df = 50 p-value = 0.00000`

Comments

(1) Small *p* values provide evidence against *H*_{0}.

(2) The test assumes data are free of nonrandom error (i.e., information bias, selection bias, and confounding) and that the random error distribution of `DELTA` tends toward normality (i.e., either the data are approximately normal or the sample is large enough for the central limit theorem to take effect).

**(1) FLUORIDE.ZIP:**

`REC BEFORE AFTER`

`--- ------ -----`

` 1 18.2 49.2`

` 2 21.9 30.0`

` 3 5.2 16.0`

` 4 20.4 47.8`

` 5 2.8 3.4`

` 6 21.0 16.8`

` 7 11.3 10.7`

` 8 6.1 5.7`

` 9 25.0 23.0`

` 10 13.0 17.0`

` 11 76.0 79.0`

` 12 59.0 66.0`

` 13 25.6 46.8`

` 14 50.4 84.9`

` 15 41.2 65.2`

` 16 21.0 52.0`

**(2) OATBRAN.ZIP**: *Oat Bran and Low Density Lipoproteins* (Data from Pagano and Gauvreau, 1993, pp. 252-253). A study was
conducted to investigate whether oat bran lowers serum cholesterol levels in hypercholesterolemic men. Fourteen individuals were
randomly placed on a diet that included either oat bran or corn flakes. Then, subjects were "crossed-over" to the alternative diet. Data
are shown below. Analyze these data as you see fit.

` REC CORNFLK OATBRAN`

` ---- ------- -------`

` 1 4.61 3.84`

` 2 6.42 5.57`

` 3 5.40 5.85`

` 4 4.54 4.80`

` 5 3.98 3.68`

` 6 3.82 2.96`

` 7 5.01 4.41`

` 8 4.34 3.72`

` 9 3.80 3.49`

` 10 4.56 3.84`

` 11 5.35 5.26`

` 12 3.89 3.73`

` 13 2.25 1.84`

` 14 4.24 4.14`

**(3) COT-NEW.ZIP:** *Degradation of Salivary Cotinine* (Fictitious data). Cotinine is a by-product of tobacco. When found in saliva, it suggests prior tobacco use or exposure. As part of a study on the use of this method, volunteers smoked a cigarette. Salivary cotinine levels were then monitored 12 and 24 hours post-exposure. Data are shown below. Calculate a 95% confidence interval for the expected change in cotinine levels.

`REC COT12HRS COT24HRS`

`--- -------- ---------`

` 1 83 14`

` 2 68 27`

` 3 68 29`

` 4 98 29`

` 5 30 4`

` 6 14 9`

` 7 141 53`

` 8 54 16`

**(4) BPH-SAMP:** *A minimally invasive therapy for the treatment of Benign Prostatic Hyperplasia* (Data from J. Morales, 2000 SJSU graduate student). Benign prostatic hyperplasia (BPH) is a noncancerous enlargement of the prostate gland that restricts the flow of urine from the bladder. The onset of BPH is associated with aging and is most commonly seen in men over the age of 50. This study looks at quality of life (QoL) and urine flow (MaxFlow) measures at the start of treatment (TX) and 3 months later (3Mo) in 10 individuals. QoL was ascertained by asking patients to rate their quality of life; the lower the number, the better the quality of life (0 = Delighted, 1 = Pleased, 2 = Mostly Satisfied, 3 = Mixed, 4 = Mostly Dissatisfied, 5 = Unhappy, 6 = Terrible). The MaxFlow variable was measured using a uro-flowmeter. A typical "normal" value for MaxFlow is 19.6 ml/s, with low values indicating obstruction of the urinary path. After 3 months, follow-up measurements were taken for both outcomes (QoL3Mo and MaxFlow3Mo, respectively) to see whether the procedure had made a positive impact on patients' symptoms. Data for the 10 patients participating in the study are shown below:

| ID | QoLTX | QoL3Mo | MaxFlowTX | MaxFlow3Mo |
|-----|-------|--------|-----------|------------|
| 001 | 2 | 1 | 7.00 | 5.00 |
| 011 | 4 | 1 | 8.00 | 18.00 |
| 021 | 3 | 1 | 8.10 | 13.15 |
| 031 | 4 | 3 | 8.80 | 15.55 |
| 041 | 5 | 2 | 11.05 | 8.10 |
| 051 | 6 | 2 | 3.50 | 8.50 |
| 061 | 4 | 2 | 9.25 | 12.25 |
| 071 | 4 | 5 | 9.70 | 5.90 |
| 081 | 3 | 3 | 8.25 | 13.60 |
| 082 | 3 | 1 | 10.45 | 13.10 |

(A) Create an Epi Info file with these data.

(B) Describe QoL at the start of treatment.

(C) Describe QoL at the 3-month mark.

(D) Describe the change in QoL.

(E) Plot the change in QoL in the form of a stem-and-leaf plot. Interpret your graph.

(G) Test *H*_{0}: µ_{d} = 0.

(H) Analyze the change in MaxFlow using methods you deem appropriate.

**Key**