Unit 14 Exercises (Case-Control Studies)

(14.1) STUDY QUESTIONS
(A) In what primary way does a case-control study differ from a cohort study?
(B) Fill in the blanks: Because subjects in case-control studies are selected based on their disease status, we can no longer estimate _______________ directly. However, the _________________ associated with an exposure can still be estimated through an odds ratio.
(C) What symbol is used to denote the odds ratio parameter? What symbol denotes the odds ratio estimator?
(D) The method used to calculate a confidence interval for an odds ratio that is presented in this chapter first has us convert the point estimate for the odds ratio to a ______________ scale.
(E) List the null hypothesis and alternative hypothesis addressed in this chapter.
(F) When is Fisher's test used in place of a chi-square test?
(G) In the 2-by-2 table used to summarize matched-pair case-control data, table cells t and w contain counts of ____________ pairs, while cells u and v contain counts of ___________ pairs.
(H) True or false? In matched case-control studies, information about concordant pairs is largely ignored.
(I) What is the name of the chi-square statistic used to test matched-pair data?

Independent Samples

(14.2) BD1: Esophageal cancer and tobacco. The "BD" in the file name reflects that the data were first used in a well-known epidemiology textbook on case-control studies by Breslow and Day. Download the data set bd1.sav from the data directory linked to the StatPrimer page on the web. After downloading the file, click File | Display Data File, and then select the file "bd1.sav." This displays information about the file's contents (an SPSS codebook for the file).

(A) Open the file and cross-tabulate data by the variables ALC and CASE. After you have the 4-by-2 cross-tabulation, split the data (by hand) into three different 2-by-2 tables in which each of the higher levels of alcohol consumption is compared to the low consumption group (i.e., 2x2 Table A: Intermediate alcohol consumption vs. Low alcohol consumption; 2x2 Table B: High alcohol consumption vs. Low alcohol consumption; 2x2 Table C: Very high alcohol consumption vs. Low alcohol consumption). Then, calculate separate odds ratios and 95% confidence intervals for each table. Initially, perform these calculations by hand. You may then check your work with EpiCalc 2000 (EpiCalc | Tables | 2-by-2 unstratified), if you wish.
(B) Perform analyses similar to the one described in part A of the problem for the variable TOB. That is, calculate separate odds ratios and confidence intervals comparing intermediate tobacco use to low tobacco use, high tobacco use to low tobacco use, and very high tobacco use to low tobacco use. Discuss your findings.
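
The split of a 4-by-2 table into three 2-by-2 comparisons against a common reference group can be sketched in Python. This is only an illustration, not the SPSS or EpiCalc workflow the exercise asks for. The counts below are made-up placeholders, not the contents of bd1.sav, and the function name odds_ratios_vs_reference is hypothetical.

```python
# Hypothetical placeholder counts (cases, controls) per exposure level.
# These are NOT the bd1.sav data; substitute your own cross-tabulation.
counts = {
    "low": (10, 100),
    "intermediate": (20, 80),
    "high": (30, 60),
    "very high": (40, 40),
}


def odds_ratios_vs_reference(table, reference):
    """Return the odds ratio of each level relative to the reference level."""
    ref_cases, ref_controls = table[reference]
    ratios = {}
    for level, (cases, controls) in table.items():
        if level == reference:
            continue  # the reference level is the comparison group
        # Cross-product ratio of the 2-by-2 table: level vs. reference.
        ratios[level] = (cases * ref_controls) / (controls * ref_cases)
    return ratios


for level, ratio in odds_ratios_vs_reference(counts, "low").items():
    print(f"{level} vs. low: OR = {ratio:.2f}")
```

Each comparison reuses the same low-exposure row, which is why the three odds ratios share a reference group and are not independent of one another.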

(14.3) DOLL1950: An early case-control study of smoking and lung cancer found that 647 of the 649 lung cancer cases were smokers, while 622 of the 649 controls were smokers. Display these data in a 2-by-2 table and then calculate the odds ratio associated with smoking. Include a 95% confidence interval for the OR. Interpret your results.

(14.4) BD2: The Oxford Childhood Cancer Survey. The data file BD2 contains data from a case-control study of childhood leukemia and lymphoma and in utero X-ray exposure. Cases are children with leukemia or lymphoma occurring during the period 1954-65 who were less than 10 years of age. Controls are similarly aged children from the neighborhood (CASE: 1 = case, 2 = control). Exposure status is based on whether the mother was exposed to X-rays during pregnancy (XRAY: 1 = yes, 2 = no). Cross-tabulate the data and then calculate the odds ratio and a 95% confidence interval for the odds ratio. Then, perform a null hypothesis test on the data. Do the data support the hypothesis that X-rays promote childhood leukemias and lymphomas?

(14.5) IUD: Intrauterine Device Use and Infertility (Cramer et al., 1985; Rosner, 1990, p. 381). A study of contraceptive use and infertility found prior use of intra-uterine devices (IUDs) in 89 out of 283 infertile women. In contrast, 640 out of 3833 (fertile) control women used IUDs. Display the data in 2-by-2 format, and calculate routine case-control statistics. Test the data for significance. (Include all hypothesis testing steps.) Interpret your findings.

(14.6) PROSTATE: A case-control study was conducted to help assess the potential relationship between vasectomy and prostate cancer (Zhu et al., 1996). Data were:

VASECTOMY    Cases Controls    TOTAL
Yes            61       93       154
No            114      165       279
TOTAL         175      258       433

Using appropriate methods from the chapter, analyze the data and interpret your findings.
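
As a way to check hand calculations, the following Python sketch computes an odds ratio and a 95% confidence interval on the log scale. It assumes the chapter's interval is the usual one based on the standard error of the log odds ratio, sqrt(1/a + 1/b + 1/c + 1/d); if the chapter uses a different method, the limits will differ. The function name odds_ratio_ci and the cell labels a, b, c, d are chosen here for illustration.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and confidence interval computed on the log scale.

    a = exposed cases,   b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    Assumes all four cells are nonzero.
    """
    or_hat = (a * d) / (b * c)                      # cross-product ratio
    se_ln = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of ln(OR)
    lower = math.exp(math.log(or_hat) - z * se_ln)
    upper = math.exp(math.log(or_hat) + z * se_ln)
    return or_hat, lower, upper


# Counts from the vasectomy table above.
or_hat, lower, upper = odds_ratio_ci(61, 93, 114, 165)
print(f"OR = {or_hat:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Because the interval is symmetric on the log scale, the point estimate is the geometric mean of the two limits rather than their midpoint.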

(14.7) ASBESTOS.REC: Data come from a case-control study of lung cancer, asbestos exposure, and cigarette smoking.

(A) Calculate the odds ratio of lung cancer associated with smoking. Include a 95% confidence interval, and interpret your findings.
(B) Calculate the odds ratio of lung cancer associated with asbestos exposure. Include a 95% confidence interval and interpret your findings.
(C) Stratify the data on smoking status and re-analyze the relationship between asbestos and lung cancer.

(14.8) BRAINTUM.REC: Electric blanket use and brain tumors in children. A case-control study by Preston-Martin et al. (1996) was done to assess potential risks for brain tumors in children (BRAINTUM: Y/N). One potential risk factor was electric blanket and water bed heater use (ELECBLANK: Y/N). Data are contained in BRAINTUM.REC. Cross-tabulate the data and calculate the odds ratio estimate and a 95% confidence interval for the OR. Interpret your results. Then test the data for significance.

Matched Samples

(14.9) Rothmanp287: A study by Witte et al. (1996) discussed by Rothman & Greenland (1998, p. 287) used matched pairs to look at risk factors for adenomatous polyps of the colon. Both cases and controls had undergone sigmoidoscopic screening, with controls matched to cases based on time of screening, clinic, age, and sex. The effects of low fruit and vegetable consumption (defined as two or fewer servings per day) were analyzed. There were 45 (u) pairs in which the case but not the control reported low consumption. There were 24 (v) pairs in which the control but not the case reported low consumption. Calculate the odds ratio associated with low fruit and vegetable consumption. Include a 95% confidence interval for the odds ratio. Then, calculate McNemar's test statistic, and summarize your results.
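
A matched-pair analysis based on the discordant counts u and v can be sketched as follows. The sketch assumes the odds ratio estimate is u / v, that the confidence interval uses the standard error sqrt(1/u + 1/v) on the log scale, and that McNemar's statistic is computed without a continuity correction; the chapter may apply a correction, in which case the statistic will be somewhat smaller. The function name matched_pair_analysis is hypothetical.

```python
import math


def matched_pair_analysis(u, v, z=1.96):
    """Odds ratio, log-scale CI, and McNemar's test from discordant pairs.

    u = pairs with case exposed, control unexposed
    v = pairs with control exposed, case unexposed
    """
    or_hat = u / v
    se_ln = math.sqrt(1 / u + 1 / v)          # SE of ln(OR) for matched pairs
    lower = math.exp(math.log(or_hat) - z * se_ln)
    upper = math.exp(math.log(or_hat) + z * se_ln)
    chi_sq = (u - v) ** 2 / (u + v)           # McNemar, no continuity correction
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_sq / 2))
    return or_hat, lower, upper, chi_sq, p_value


or_hat, lower, upper, chi_sq, p_value = matched_pair_analysis(45, 24)
print(f"OR = {or_hat:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
print(f"McNemar chi-square = {chi_sq:.2f}, p = {p_value:.4f}")
```

Only the discordant pairs enter the calculation; the concordant pairs do not affect the estimate or the test.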

(14.10) SMOKINGTWINS: When smoking was first suspected of causing lung cancer and heart disease, Sir Ronald Fisher, then the world's greatest living statistician (and a smoker), offered the "constitution hypothesis" that people genetically disposed to develop these diseases might be more likely to smoke (i.e., genetics was confounding the association). The hypothesis was put to rest in a 1989 study of 22 smoking-discordant monozygotic twin pairs in which at least one twin had died. In this study, the smoker died first in 17 cases. Calculate the odds ratio of death associated with smoking, and test the association for significance.
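
With only 22 pairs, an exact test may be preferable to a chi-square approximation. The sketch below treats the twin data as matched pairs with u = 17 (smoker died first) and v = 22 - 17 = 5 (nonsmoker died first), which is one reading of the problem, and applies a two-sided exact binomial test of the null hypothesis that either twin is equally likely to die first. Whether the chapter intends this test or McNemar's statistic is an assumption; the function name exact_binomial_two_sided is hypothetical.

```python
from math import comb


def exact_binomial_two_sided(u, v):
    """Two-sided exact binomial p-value for discordant pairs under p = 0.5."""
    n = u + v
    k = max(u, v)
    # Probability of a split at least as extreme as k out of n in one tail.
    one_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # Double the tail for a two-sided test, capping the result at 1.
    return min(1.0, 2 * one_tail)


u, v = 17, 5            # smoker died first, nonsmoker died first
or_hat = u / v          # matched-pair odds ratio estimate
p_value = exact_binomial_two_sided(u, v)
print(f"OR = {or_hat:.1f}, exact two-sided p = {p_value:.4f}")
```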

(14.11) LILIENFELDp220: Between 1969 and 1971, the Collaborative Group for the Study of Stroke in Young Women conducted a case-control study in 12 university hospitals of cerebrovascular disease and oral contraceptive (OC) use in non-pregnant women aged 14 to 44 years. Subjects were matched according to age, sex, and race. Matched-pair data for thrombotic stroke and hemorrhagic stroke, each with neighborhood controls, are:

               Thrombotic Stroke            Hemorrhagic Stroke
               Cntl    Cntl                 Cntl    Cntl
               OC+     OC-     TOT          OC+     OC-     TOT
Case OC+         2      44      46            5      30      35
Case OC-         5      55      60           13     107     120
TOT              7      99     106           18     137     155

(A) Calculate the odds ratios associated with each table. Include 95% confidence intervals. Interpret your findings in each instance.
(B) Later, the following analysis was performed ignoring the paired sampling:

                  Thrombotic Stroke         Hemorrhagic Stroke
OC Use            Cases    Cntl             Cases    Cntl
User                59       69               44       69
Nonuser             81      382              152      382
TOT                140      451              196      451

Calculate the odds ratios for each of the unmatched 2x2 tables. How do these compare to the odds ratios calculated in the matched analyses? Which analysis is preferred, and why?

Key to Odd Numbered Problems

Key to Even Numbered Problems (may not be posted)