**Background | Confidence Interval |** *p* Value | Power and Sample Size | Exercises

In the previous chapter we compared the incidence of disease in an exposed and non-exposed group. In this chapter, we select from a source population people with disease (cases) and without disease ("controls") and then compare their prior exposure experience (case-control sampling). Data from case-control studies, once cross-tabulated, are displayed as follows:

| | Cases | Controls | Total |
|---|---|---|---|
| Exposure + | *a* | *b* | *n*_{1} |
| Exposure - | *c* | *d* | *n*_{2} |
| Total | *m*_{1} | *m*_{2} | *N* |

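The exposure proportion in cases is *p*_{1} = *a* / *m*_{1} and the exposure proportion in controls is *p*_{2} = *b* / *m*_{2}. The complement of *p*_{i} is *q*_{i} = 1 - *p*_{i}, so the exposure odds in group *i* are *p*_{i} / *q*_{i}. The odds ratio estimate (*or*) is the ratio of the exposure odds in cases to the exposure odds in controls, which reduces to the cross-product ratio:

$$
\mathit{or} = \frac{p_1 / q_1}{p_2 / q_2} = \frac{(a/m_1)/(c/m_1)}{(b/m_2)/(d/m_2)} = \frac{a/c}{b/d} = \frac{ad}{bc}
$$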

It can be shown that the odds ratio from a case-control study is stochastically equivalent to a rate ratio or risk ratio, depending on how cases and controls are sampled from the source population (incidence density vs. risk sampling).

**Illustrative Example** (

**EpiInfo Commands.** To process the data in

`EPI6> TABLES <exposure> <disease>`

where `<exposure>` represents the name of the exposure variable and `<disease>` represents the name of the disease variable.

For our illustrative data, issue the commands:

`EPI6> READ BDNEW`

`EPI6> TABLES ALCHIGH CASE`

Data are:

`              CASE
ALCHIGH    |    1     2  | Total
-----------+-------------+------
         1 |   96   109  |   205
         2 |  104   666  |   770
-----------+-------------+------
     Total |  200   775  |   975 `

From the above table we determine that the exposure proportion in cases is *p*_{1} = 96 / 200 = 0.480 and the exposure proportion in controls
is *p*_{2} = 109 / 775 = 0.141. The odds ratio is (96)(666) / [(109)(104)] = 5.64, suggesting that high-level alcohol consumers have 5.6 times the
incidence of esophageal cancer of low-level alcohol consumers.
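The arithmetic above can be checked with a few lines of Python. This is a minimal sketch and not part of EpiInfo; the function name `case_control_summary` is a hypothetical helper introduced here, and the arguments *a*, *b*, *c*, *d* follow the cell labels of the notation table at the start of the chapter.

```python
def case_control_summary(a, b, c, d):
    """Exposure proportions and cross-product odds ratio for a 2-by-2 table.

    a = exposed cases,     b = exposed controls,
    c = non-exposed cases, d = non-exposed controls.
    (Hypothetical helper; names are assumptions, not EpiInfo functions.)
    """
    m1 = a + c                      # total number of cases
    m2 = b + d                      # total number of controls
    p1 = a / m1                     # exposure proportion in cases
    p2 = b / m2                     # exposure proportion in controls
    odds_ratio = (a * d) / (b * c)  # cross-product ratio
    return p1, p2, odds_ratio


# Illustrative data from the ALCHIGH-by-CASE table
p1, p2, odds_ratio = case_control_summary(96, 109, 104, 666)
print(f"p1 = {p1:.3f}, p2 = {p2:.3f}, or = {odds_ratio:.2f}")
# prints: p1 = 0.480, p2 = 0.141, or = 5.64
```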

The point estimate and 95% confidence interval for the *OR* are printed below the 2-by-2
table:

` Single Table Analysis`

`Odds ratio 5.64`

`Cornfield 95% confidence limits for OR 3.93 < OR < 8.10`

`Maximum likelihood estimate of OR (MLE) 5.63`

`Exact 95% confidence limits for MLE 3.94 < OR < 8.06`

`Exact 95% Mid-P limits for MLE 3.99 < OR < 7.95`

The standard "cross-product ratio" odds ratio point estimate is printed in line
1 (*or* = 5.64). The standard 95% confidence interval for the *OR* is printed in line
2 (95% CI: 3.93 - 8.10). Maximum likelihood estimates (reported on lines 3 - 5)
are seldom necessary.
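The Cornfield and exact limits printed by EpiInfo require iterative calculation. As a rough hand check, a simpler logit-based (Woolf) interval can be sketched in Python. Note that this is a different method from those in the printout, so its limits will not match the printed ones exactly; the function name `woolf_ci` is a hypothetical helper, and the cell labels *a*, *b*, *c*, *d* follow the notation table.

```python
import math
from statistics import NormalDist


def woolf_ci(a, b, c, d, confidence=0.95):
    """Logit-based (Woolf) confidence interval for the odds ratio.

    This is NOT the Cornfield or exact method reported by EpiInfo;
    it is swapped in here as a simple closed-form approximation.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(or): square root of the summed reciprocal cell counts
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    # Two-sided standard normal critical value (about 1.96 for 95%)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Illustrative data from the ALCHIGH-by-CASE table
odds_ratio, lower, upper = woolf_ci(96, 109, 104, 666)
print(f"or = {odds_ratio:.2f}, 95% CI: {lower:.2f} - {upper:.2f}")
```

For the illustrative data the Woolf limits fall at roughly 4.0 and 8.0, close to the Cornfield limits of 3.93 and 8.10.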

*P* values for *H*_{0}: *OR* = 1 are computed with
three different chi-square methods:

` Chi-Squares P-values`

` ----------- --------`

` Uncorrected: 110.26 0.00000000 <---`

` Mantel-Haenszel: 110.14 0.00000000 <---`

` Yates corrected: 108.22 0.00000000 <---`
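The three statistics can be reproduced with the standard textbook formulas for a 2-by-2 table, sketched below in Python. That EpiInfo computes them in exactly this way is an assumption, although the formulas do return the printed values for the illustrative data. The function name `chi_square_tests` is hypothetical. Each *P* value is the upper-tail area of a chi-square distribution with 1 degree of freedom, obtained here from the complementary error function.

```python
import math


def chi_square_tests(a, b, c, d):
    """Uncorrected, Mantel-Haenszel, and Yates chi-squares (1 df) for a 2-by-2 table.

    Hypothetical helper using standard textbook formulas.
    """
    n1, n2 = a + b, c + d    # row totals (exposed, non-exposed)
    m1, m2 = a + c, b + d    # column totals (cases, controls)
    n = n1 + n2              # grand total
    margins = n1 * n2 * m1 * m2
    diff = abs(a * d - b * c)

    uncorrected = n * diff**2 / margins
    mantel_haenszel = (n - 1) * diff**2 / margins
    yates = n * max(diff - n / 2, 0) ** 2 / margins  # continuity-corrected

    def p_value(x):
        # Upper-tail probability of a chi-square variate with 1 df
        return math.erfc(math.sqrt(x / 2))

    stats = [
        ("Uncorrected", uncorrected),
        ("Mantel-Haenszel", mantel_haenszel),
        ("Yates corrected", yates),
    ]
    return {name: (stat, p_value(stat)) for name, stat in stats}


# Illustrative data from the ALCHIGH-by-CASE table
results = chi_square_tests(96, 109, 104, 666)
for name, (stat, p) in results.items():
    print(f"{name:<16} {stat:8.2f}   P = {p:.2e}")
```

The *P* values are far smaller than the eight decimal places EpiInfo prints, which is why the printout shows them as 0.00000000.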

The statistical power of a study is the probability of correctly rejecting a false *H*_{0} under certain distributional assumptions. The
program *EpiTable* has an excellent power and sample size calculator. In *EpiTable*, select Sample > Power calculation
> Case-control study. You must then provide assumptions for the number of cases (*m*_{1}), the ratio of controls to cases in the
study (*m*_{2} / *m*_{1}), an odds ratio "worth detecting" (*OR*), the exposure proportion in controls (*p*_{2}), and the alpha level (α) or confidence
level (1 - α) required by the research. Sample size requirements can be determined by selecting `EpiTable > Sample >
Sample Size > Case-control study`.
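To show how these inputs combine, here is a minimal Python sketch of a common normal-approximation power formula for unmatched case-control studies. Whether *EpiTable* uses this exact formula is an assumption; the function names `exposure_in_cases` and `case_control_power` and the example input values are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def exposure_in_cases(odds_ratio, p2):
    """Exposure proportion in cases (p1) implied by an odds ratio and p2."""
    return odds_ratio * p2 / (1 + p2 * (odds_ratio - 1))


def case_control_power(m1, ratio, odds_ratio, p2, alpha=0.05):
    """Approximate power of a two-sided test of H0: OR = 1.

    m1 = number of cases, ratio = controls per case (m2 / m1),
    p2 = exposure proportion in controls.
    Normal approximation; not necessarily EpiTable's own algorithm.
    """
    norm = NormalDist()
    p1 = exposure_in_cases(odds_ratio, p2)
    p_bar = (p1 + ratio * p2) / (1 + ratio)  # pooled exposure proportion
    z_alpha = norm.inv_cdf(1 - alpha / 2)
    se_null = math.sqrt((1 + 1 / ratio) * p_bar * (1 - p_bar))
    se_alt = math.sqrt(p1 * (1 - p1) + p2 * (1 - p2) / ratio)
    z_beta = (abs(p1 - p2) * math.sqrt(m1) - z_alpha * se_null) / se_alt
    return norm.cdf(z_beta)


# Hypothetical inputs: 200 cases, 1 control per case, p2 = 0.25
for odds_ratio in (1.5, 2.0, 3.0):
    power = case_control_power(m1=200, ratio=1, odds_ratio=odds_ratio, p2=0.25)
    print(f"OR = {odds_ratio:.1f}: power = {power:.2f}")
```

Evaluating the function over a range of odds ratios gives the points of a power curve, and the sketch shows why power rises with the number of cases and with the size of the odds ratio worth detecting.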

**Illustrative example.** Suppose we want to detect an

**(1) DOLL1950**: *Smoking and Lung Cancer* (Doll & Hill, 1950). A historically important case-control study of smoking and lung
cancer found 647 of 649 lung cancer cases were smokers. In contrast, 622 of 649 non-cancer controls were smokers. Show these data
in a 2-by-2 table and then, using an epidemiologic calculator, compute the odds ratio and its 95% confidence interval. Interpret your
findings.

**(2) ESOPH_CA.ZIP**: *Esophageal Cancer and Tobacco Consumption* (Tuyns, 1977; Breslow & Day, 1980). Download the data set
`ESOPH_CA`. Then determine the effect of tobacco consumption with alcohol dichotomized at 80 g/day (`TOB2`) on esophageal
cancer risk (`ESOPH_CA`: 1 = case, 2 = control). Compute the odds ratio and its 95% confidence interval. Then perform a significance
test and summarize your results in narrative form.

**(3) ESOPH_CA.ZIP**: *Esophageal Cancer and Age* (Tuyns, 1977; Breslow & Day, 1980). Use the same data set you used in the
previous exercise to determine the effect of age on esophageal cancer risk. The exposure variable is `AGE2` (older = 55+ years, younger
= 35-54 years). The disease variable is `ESOPH_CA` (1 = case, 2 = control). Compute the odds ratio and its 95% confidence interval.
Interpret your findings.

**(4) BD2.ZIP**: *Breslow & Day 2* (Stewart & Kneale, 1970; Kneale, 1971; Breslow & Day, 1980, p. 238). Data come from a
case-control study of childhood leukemia and lymphoma and *in utero* exposure to X-rays. Cases occurred among children less than 10 years of age
in England and Wales during the period 1954-65 (variable `CASE`: 1 = yes, 2 = no). For each case, a neighborhood control
of the same age and year of birth was selected. Exposure status is based on whether mothers were exposed to X-rays during pregnancy
(variable `XRAY`: 1 = yes, 2 = no). Calculate the odds ratio estimate and its 95% confidence interval. Test the odds ratio for
significance. Interpret your findings in narrative form.

**(5) IUD**: *Intrauterine Device Use and Infertility* (Cramer et al., 1985; Rosner, 1990, p. 381). A study of contraceptive use and
infertility found prior use of intrauterine devices (IUDs) in 89 out of 283 infertile women. In contrast, 640 out of 3833 (fertile) control
women used IUDs. Calculate relevant case-control statistics and then summarize your results in plain language.

**(6) PROSTATE.ZIP**: *Vasectomy and Prostate Cancer* (Data source: Zhu et al., 1996). A case-control study was conducted to help
assess the potential relationship between vasectomy and prostate cancer. Calculate the odds ratio and its 95% confidence interval.
Then, determine the sample size required to detect a significant odds ratio of 1.3 with 80% power.

**(7) ASBESTOS.ZIP**: *Asbestos Exposure and Lung Cancer* (Hypothetical data). Data are from a case-control study of lung cancer. The
data set contains information on smoking status (`SMOKE`: Y / N), asbestos exposure (`ASBESTOS`: Y / N), and lung cancer (`LUNGCA`:
Y / N).

(A) Calculate the odds ratio of lung cancer associated with smoking. Include a 95% confidence interval. Interpret your findings.

(B) Calculate the odds ratio of lung cancer associated with asbestos exposure. Include a 95% confidence interval and interpret your
findings.

**(8) BRAINTUM.ZIP**: *Electric blanket use and brain tumors in children* (Preston-Martin et al., 1996). This case-control study
analyzed the relation between brain tumors in children (`BRAINTUM`: Y / N) and exposure to electric blankets and water bed heaters
(`ELECBLANK`: Y / N). Analyze the data and interpret your results. Then, calculate the study's power to uncover odds ratios of (i) 1.1;
(ii) 1.2; (iii) 1.3; (iv) 1.4; (v) 1.5; (vi) 1.6; (vii) 1.7; (viii) 2.0. Combine your power estimates to form a power curve so that the
*x*-axis represents the expected odds ratio and the *y*-axis represents the study's power. Discuss your power analysis in this light.
At what point does the study's power become adequate? What can be done to improve this study's power? Would you supplement
the study with additional information?